Using the Crossvalidation service

Service procedure

The cross-validation (xval) process splits the sample data into subsets so that the analysis is initially performed on a single subset, while the other subset(s) are retained to confirm and validate that initial analysis.

In K-fold xval the original data is partitioned into K subsamples. The cross-validation is then repeated K times, each time holding out a different subsample for validation and using the remaining K-1 subsamples for the analysis. If K equals the number of observations, the procedure is a Leave-One-Out Cross-Validation (LOOCV), in which a single observation is removed from the original data and its value is estimated from the remaining set. The xval process uses a default of K = 10. The use of LOOCV is not advised because it is extremely time consuming.
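
The partitioning described above can be illustrated with a minimal Python sketch. It is not the code of the xval service: the function names are hypothetical, and the mean of the training values stands in for the real interpolation model purely to keep the example self-contained.

```python
import random


def k_fold_indices(n_obs, k=10, seed=0):
    """Split the indices 0..n_obs-1 into k folds of nearly equal size."""
    if not 1 < k <= n_obs:
        raise ValueError("k must satisfy 1 < k <= n_obs")
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(values, k=10, seed=0):
    """Return one (observed, predicted) pair per observation."""
    pairs = []
    for held_out in k_fold_indices(len(values), k, seed):
        held = set(held_out)
        training = [v for i, v in enumerate(values) if i not in held]
        # Placeholder predictor (assumption): the mean of the training values
        # stands in for the interpolation performed by the real service.
        prediction = sum(training) / len(training)
        pairs.extend((values[i], prediction) for i in held_out)
    return pairs


values = [float(v) for v in range(1, 21)]        # 20 toy observations
ten_fold = cross_validate(values, k=10)          # default K-fold of 10
loocv = cross_validate(values, k=len(values))    # K == number of observations
```

With K = 10 the predictor is fitted 10 times, whereas LOOCV fits it once per observation, which is why it becomes expensive when each fit is a full interpolation.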

The xval process is designed to use the same inputs/outputs as the INTAMAP system. It therefore uses UncertML and Obs&Meas as its main XML structures. The xval service runs as a complete Python script integrated in a pyWPS implementation, so it is not possible to port it to another system architecture as an API (as is possible with the INTAMAP-System). Thus the xval service can only exist as a web service.

Currently an xval service is available at:

http://remwps2.jrc.ec.europa.eu/cgi-bin/wps.py

Input

Several xval request examples can be found in the SVN tree.

The examples are based on the Meuse data set and INTAMAP-System interpolation requests.
Using the wget command it is possible to send a request and obtain a WPS response containing either the result or the status document:

wget -O - -n -q --post-file= http://remwps.jrc.it/cgi-bin/wps.py
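
The same kind of POST request can also be sketched in Python with the standard library. This is only an illustration: the function names and the placeholder request body are assumptions, and a real request would carry one of the XML examples from the SVN tree.

```python
import urllib.request


def build_wps_post(url, request_xml):
    """Build (but do not send) an HTTP POST carrying a WPS request document."""
    return urllib.request.Request(
        url,
        data=request_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the WPS response (result or status document)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder body (assumption): replace with the content of a real request file.
request = build_wps_post("http://remwps.jrc.it/cgi-bin/wps.py", "<placeholder/>")
# response_xml = send(request)  # uncomment to contact the server
```

Building the request separately from sending it makes it possible to inspect the URL, headers and body before any network traffic takes place.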

Basic input

Following the INTAMAP philosophy, the xval basic input is minimal: a simple data set encoded as Obs&Meas in the XML request is enough to generate a response. In this case the service uses all the automatic settings of the interpolation service, the interpolation model is picked automatically, and the default K-fold of 10 is used for cross-validation.
Complete list of possible inputs: the K-fold value is the only parameter specific to the xval service; the remaining inputs are forwarded to the interpolation service. A user can therefore specify which interpolation server to use (URL location), which process name (org.intamap.wps.Interpolate), which interpolation method (automatic, automap, psgp, etc.), extra method parameters, and the maximum time to wait for a reply from the interpolation model.
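
As a rough illustration of how such inputs could be packed into a request, the sketch below builds an Execute document assuming the WPS 1.0.0 XML encoding. The process identifier, every input identifier, the example server URL and the unit of the time limit are hypothetical placeholders, not the names used by the xval service; the real identifiers should be taken from the request examples in the SVN tree. The Obs&Meas data set, which would be passed as complex data, is left out for brevity.

```python
import xml.etree.ElementTree as ET

WPS_NS = "http://www.opengis.net/wps/1.0.0"
OWS_NS = "http://www.opengis.net/ows/1.1"


def build_execute(process_id, literal_inputs):
    """Return a WPS 1.0.0 Execute document holding only literal inputs."""
    ET.register_namespace("wps", WPS_NS)
    ET.register_namespace("ows", OWS_NS)
    execute = ET.Element(f"{{{WPS_NS}}}Execute",
                         {"service": "WPS", "version": "1.0.0"})
    ET.SubElement(execute, f"{{{OWS_NS}}}Identifier").text = process_id
    data_inputs = ET.SubElement(execute, f"{{{WPS_NS}}}DataInputs")
    for name, value in literal_inputs.items():
        wps_input = ET.SubElement(data_inputs, f"{{{WPS_NS}}}Input")
        ET.SubElement(wps_input, f"{{{OWS_NS}}}Identifier").text = name
        data = ET.SubElement(wps_input, f"{{{WPS_NS}}}Data")
        ET.SubElement(data, f"{{{WPS_NS}}}LiteralData").text = str(value)
    return ET.tostring(execute, encoding="unicode")


# All identifiers and values below are hypothetical placeholders.
request_xml = build_execute(
    "xval",
    {
        "kfold": 5,                                   # xval-specific input
        "interpolationServer": "http://example.org/intamap",
        "processName": "org.intamap.wps.Interpolate",
        "methodName": "automap",
        "maxTime": 30,                                # unit is an assumption
    },
)
```

Only the first entry would be consumed by the xval service itself; the others correspond to the inputs that are forwarded to the interpolation service.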

InterpolationServer

Any server running the INTAMAP-System can be used as the interpolation service consumed by xval. Since both systems are independent, the same physical server can serve both without problems.

MethodParameters

Xval is normally performed after an interpolation procedure, with the objective of assessing the quality of the results. Interpolation requests to the INTAMAP service output a MethodParameters string (see: tryAPI) that contains all the information/metadata used to interpolate the data set. This string is reused inside the xval service as the methodParameters WPS input.