Download and installation

Quick Installation guide

The crossvalidation service is a WPS service designed to test the interpolation results of the INTAMAP system by generating statistical values of the interpolation errors. An explanation of cross-validation in statistics can be found on Wikipedia. Basically, the crossvalidation service accepts a dataset and sends it to the INTAMAP system to determine the best spatial model that describes the dataset. It then splits the dataset into a training set and a validation set. The training set is used to interpolate the values at the validation set locations using the initial spatial model. In the final stage, the service compares the true values of the validation set with the ones returned by the interpolation service.
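The split, interpolate and compare steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the service's actual code: the function names are hypothetical, the split is a simple random hold-out, and a plain inverse-distance-weighted estimate stands in for the call to the INTAMAP interpolation service.

```python
import math
import random


def idw_predict(train, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    Stand-in (assumption) for the INTAMAP interpolation call.
    """
    num = 0.0
    den = 0.0
    for tx, ty, tv in train:
        d = math.hypot(tx - x, ty - y)
        if d == 0.0:
            return tv  # exact hit on a training location
        w = 1.0 / d ** power
        num += w * tv
        den += w
    return num / den


def cross_validate(points, validation_fraction=0.25, seed=0):
    """Hold out part of the (x, y, value) points and report error statistics."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one point on each side of the split.
    n_val = int(len(shuffled) * validation_fraction)
    n_val = min(max(1, n_val), len(shuffled) - 1)
    validation, training = shuffled[:n_val], shuffled[n_val:]
    # Error = interpolated value minus true value at each validation point.
    errors = [idw_predict(training, x, y) - v for x, y, v in validation]
    n = len(errors)
    return {
        "n_validation": n,
        "mean_error": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


if __name__ == "__main__":
    rng = random.Random(42)
    coords = [(rng.random(), rng.random()) for _ in range(40)]
    data = [(x, y, 2.0 * x + y) for x, y in coords]
    print(cross_validate(data))
```

The returned dictionary holds the kind of summary statistics the service is meant to produce; which statistics the real service reports is not specified here.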

General description

The crossvalidation implementation is independent from the INTAMAP system and is based on different technologies, while respecting the same web service protocols and standards.

The INTAMAP system runs on a Tomcat-Rserver-IntamapPackage stack, while the crossvalidation uses an Apache-pyWPS/python-R stack. Crossvalidating a dataset is a basic web service procedure in which one service (the crossvalidation) works both as a service provider and as a service consumer. The crossvalidation service accepts a data input and generates a crossvalidation result using a Python script and some specific calls to R, made through the Rpy module, to create the graphics. The R session is therefore run locally and not as a server service (as in Rserve).

pyWPS

pyWPS is a WPS implementation written in Python whose objective is to expose GRASS-GIS tools as web services within a service-oriented architecture. Nevertheless, this implementation provides WPS support for any Python script. Its web page can be found at http://pywps.wald.intevation.org/. For questions that are not answered on the site, or in case of a serious error, subscribing to the general pyWPS mailing list is recommended. pyWPS should be installed from the SVN tree, since the SVN version is more up to date and has fewer bugs than the Debian packages. The source code can be fetched using the SVN command in a common bash prompt.

crossvalidation process

pyWPS implements processes as Python classes inside the pyWPS process module. When wps.py is called, the processes/classes are loaded and handled according to the request. The crossvalidation service script can be downloaded from the project SVN tree (the service is contained in the crossvalidation folder) and should be installed as explained on the pyWPS website.

The crossvalidation script requires the following packages:

It is important that Rpy2 is properly installed and that Python can connect to it. If the following command does not produce an error message, then everything is OK:

    python -c "import rpy2;import rpy2.robjects as robjects;R=robjects.r"

The crossvalidation can be tested either as a cgi-bin or as a bash command.

As a bash command:

    /usr/bin/wps.py "service=wps&request=getcapabilities"

As a cgi-bin in the browser:

    http:///cgi-bin/wps.py?service=wps&request=getcapabilities

If you want to send an XML file with a WPS request, you have to use the cat command:

    cat | /usr/bin/wps.py

If there are any problems, either in the WPS or in Python, they will be reported as WPS exceptions or between XML comment tags.

One common problem is the loading of R libraries by Rpy2 and the crossvalidation script. One fast solution is to set R_LIBS and R_HOME in the script itself. Around line 88 there is the comment:

    #os.environ["R_LIBS"]="/usr/lib/apache2/Rlibs"

To uncomment it, it is enough to remove the "#". This command sets the R_LIBS environment variable to the path where the R libraries are. This path can change from one installation to another. It is also possible to set R_HOME using the same command. Note that os.environ will only work after Python's os module has been imported (import os).

Depending on the system and configuration, it may be a problem for pyWPS to locate the process directory or the configuration file. This can be solved using a wrapper script. In this case wps.py is called from inside a script that first defines the environment variables.
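A wrapper of the kind described above could look roughly like the following Python sketch. It is not the project's own wrapper. The helper names are hypothetical, and PYWPS_CFG and PYWPS_PROCESSES are assumed to be the environment variables that the installed pyWPS version reads for its configuration file and process directory, so check them against the pyWPS documentation for your version. The paths used are placeholders taken from, or modelled on, the examples in this guide.

```python
import os
import subprocess

WPS_SCRIPT = "/usr/bin/wps.py"  # location used in the examples of this guide


def build_wps_environment(base_env, r_libs, r_home=None,
                          pywps_cfg=None, pywps_processes=None):
    """Return a copy of base_env with the R and pyWPS variables added.

    PYWPS_CFG and PYWPS_PROCESSES are assumptions about what pyWPS reads.
    """
    env = dict(base_env)  # copy, so the caller's environment is untouched
    env["R_LIBS"] = r_libs
    if r_home is not None:
        env["R_HOME"] = r_home
    if pywps_cfg is not None:
        env["PYWPS_CFG"] = pywps_cfg
    if pywps_processes is not None:
        env["PYWPS_PROCESSES"] = pywps_processes
    return env


def run_wps(query, env, wps_script=WPS_SCRIPT):
    """Call wps.py with a query string and return its exit code.

    stdin and stdout are inherited, so piped XML requests still work.
    """
    return subprocess.call([wps_script, query], env=env)


if __name__ == "__main__":
    env = build_wps_environment(
        os.environ,
        r_libs="/usr/lib/apache2/Rlibs",
        pywps_cfg="/etc/pywps.cfg",            # hypothetical path
        pywps_processes="/usr/lib/cgi-bin/processes",  # hypothetical path
    )
    # Only run when wps.py is actually installed at the expected location.
    if os.path.exists(WPS_SCRIPT):
        run_wps("service=wps&request=getcapabilities", env)
```

Setting the variables in a wrapper rather than inside the crossvalidation script keeps the script unchanged between installations, since only the wrapper holds machine-specific paths.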