gwlearn.ensemble.GWRandomForestClassifier

class gwlearn.ensemble.GWRandomForestClassifier(*, bandwidth, fixed=False, kernel='bisquare', n_jobs=-1, fit_global_model=True, measure_performance=True, strict=False, keep_models=False, temp_folder=None, batch_size=None, min_proportion=0.2, undersample=False, random_state=None, verbose=False, **kwargs)

Generic geographically weighted random forest classifier

NOTE: Local models leave out the focal observation, unlike in traditional approaches. This allows geographically weighted metrics to be assessed on unseen data without a train/test split, hence providing a value for every sample. This is needed for further spatial analysis of model performance (and generalises to models that do not support OOB scoring).
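The leave-focal-out idea in the note can be illustrated with a minimal, standard-library sketch. It is not gwlearn's implementation: the function name `leave_focal_out_predictions` is hypothetical, and a simple majority vote stands in for the local random forest. The point is only that each observation is predicted by a model that never saw it, so the resulting score behaves like an out-of-sample metric.

```python
import math


def leave_focal_out_predictions(coords, labels, k):
    """Predict each point's label from its k nearest neighbours,
    excluding the focal point itself (hypothetical helper)."""
    predictions = []
    for i, focal in enumerate(coords):
        # Distances to every other observation; the focal point is skipped.
        neighbours = sorted(
            (math.dist(focal, other), j)
            for j, other in enumerate(coords)
            if j != i
        )[:k]
        # Stand-in "local model": majority vote over the neighbours' labels.
        votes = {}
        for _, j in neighbours:
            votes[labels[j]] = votes.get(labels[j], 0) + 1
        predictions.append(max(votes, key=votes.get))
    return predictions


# Two well-separated clusters of three points each.
coords = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]

preds = leave_focal_out_predictions(coords, labels, k=2)
# Every prediction comes from a model that did not see the focal point.
accuracy = sum(p == t for p, t in zip(preds, labels)) / len(labels)
```

Because every sample receives its own held-out prediction, the per-sample results can be mapped and analysed spatially, which a single train/test split would not allow.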

Parameters:

bandwidth (int | float): Bandwidth value, given either as a distance or as a number of nearest neighbors
fixed (bool, optional): True for distance-based bandwidth and False for adaptive (nearest neighbor) bandwidth, by default False
kernel (str | Callable, optional): Type of kernel function used to weight observations, by default “bisquare”
n_jobs (int, optional): The number of jobs to run in parallel. -1 means using all processors, by default -1
fit_global_model (bool, optional): Determines if the global baseline model shall be fitted alongside the geographically weighted one, by default True
measure_performance (bool, optional): Calculate performance metrics for the model, by default True
strict (bool | None, optional): Do not fit any models if at least one neighborhood has invariant y, by default False. None is treated as False but issues a warning if there are invariant models.
keep_models (bool | str | Path, optional): Keep all local models (required for prediction), by default False. Note that for some models, like random forests, the objects can be large. If a string or Path is provided, the local models are not held in memory but serialized to disk, from which they are loaded during prediction.
temp_folder (str | None, optional): Folder to be used by the pool for memmapping large arrays for sharing memory with worker processes, e.g., /tmp. Passed to joblib.Parallel, by default None
batch_size (int | None, optional): Number of models to process in each batch. Specify batch_size if your models do not fit into memory. By default None
min_proportion (float, optional): Minimum proportion of the minority class for a model to be fitted, by default 0.2
undersample (bool, optional): Whether to apply random undersampling to balance classes, by default False
random_state (int | None, optional): Random seed for reproducibility, by default None
verbose (bool, optional): Whether to print progress information, by default False
**kwargs: Additional keyword arguments passed to model initialisation
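To make the interplay of `bandwidth`, `fixed`, and the default `kernel` concrete, here is a minimal standard-library sketch of bisquare weighting. It uses the textbook bisquare formula, w = (1 - (d/b)^2)^2 for d < b and 0 otherwise. The helper names `bisquare` and `local_weights` are hypothetical, and the adaptive case assumes the effective bandwidth is the distance to the N-th nearest neighbor; gwlearn's exact conventions may differ.

```python
import math


def bisquare(distance, bandwidth):
    """Textbook bisquare kernel: weight decays to 0 at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1 - (distance / bandwidth) ** 2) ** 2


def local_weights(focal, coords, bandwidth, fixed=False):
    """Weights of `coords` relative to `focal` (hypothetical helper).

    fixed=True  -> `bandwidth` is a distance threshold.
    fixed=False -> `bandwidth` is a neighbor count; the threshold is
                   assumed to be the distance to the N-th nearest neighbor.
    """
    distances = [math.dist(focal, c) for c in coords]
    if fixed:
        threshold = bandwidth
    else:
        threshold = sorted(distances)[int(bandwidth) - 1]
    return [bisquare(d, threshold) for d in distances]


# Neighbors at distances 1, 2, 3 and 4 from the focal point (0, 0).
neighbours = [(1, 0), (0, 2), (3, 0), (0, 4)]

fixed_w = local_weights((0, 0), neighbours, bandwidth=4, fixed=True)
adaptive_w = local_weights((0, 0), neighbours, bandwidth=3, fixed=False)
```

With a fixed bandwidth the same distance threshold applies everywhere, whereas with an adaptive bandwidth the threshold stretches or shrinks with local point density, so each local model sees roughly the same number of observations.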

__init__(*, bandwidth, fixed=False, kernel='bisquare', n_jobs=-1, fit_global_model=True, measure_performance=True, strict=False, keep_models=False, temp_folder=None, batch_size=None, min_proportion=0.2, undersample=False, random_state=None, verbose=False, **kwargs)

Methods

`__init__`(*, bandwidth[, fixed, kernel, ...])
`fit`(X, y, geometry)	Fit the geographically weighted model
`predict`(X, geometry)
`predict_proba`(X, geometry)	Predict probabilities using the ensemble of local models

fit(X, y, geometry)

Fit the geographically weighted model

Parameters:

X (pd.DataFrame): Independent variables
y (pd.Series): Dependent variable
geometry (gpd.GeoSeries): Geographic location
