Submitting your planner
This page explains how to submit the planners to us, and how the planners will be invoked during the competition. Please pay close attention to this: due to the large number of participants, we will require that the rules on this page are followed strictly.
We remind you that the planner submission deadline is May 31st 2008. We follow the ICAPS rules regarding time zones: Your submission is on time as long as there is some time zone in which it is still May 31st. In other words, the deadline is 11:59 PM UTC-10 (Honolulu time). The deadline is strict. No extensions will be granted.
Sending us your planner
Planners are submitted by email to Malte Helmert <helmert AT informatik DOT uni DASH freiburg DOT de>.
- Your email should contain the planner either as an attachment, or provide a download link.
- Your email should contain a filled-in PDDL support questionnaire, which describes the level of PDDL support of your planner. If there are additional limitations not captured by the questionnaire, please indicate them, too.
- It is OK to submit several copies of your planner, for example if you have a running version that you want to submit by May 30th, and then implement some last-minute improvements on May 31st. In that case, we will use the last version of the planner that arrived before the deadline.
- The planner must be submitted as a zip, tar.gz or tar.bz2 archive.
- The archive must include the full source code of your planner, to be published on this web site after the competition. It must not contain any executables.
- The archive must extract to a number of directories, one for each track in which you participate. So if you only participate in a single track, the archive must extract to a single directory. The directories must be named as follows (where planner-name is the name of your planner):
seq-sat-planner-name for the sequential satisficing track
seq-opt-planner-name for the sequential optimization track
tempo-sat-planner-name for the temporal satisficing track
tempo-opt-planner-name for the temporal optimization track
netben-sat-planner-name for the net benefit satisficing track
netben-opt-planner-name for the net benefit optimization track
Please follow Linux naming conventions (lower-case, no spaces) for the names of the directories. For example, if your planner is called "Fast Downward" and participates only in the satisficing sequential track, a good name for the directory would be seq-sat-fast-downward. Please create separate directories for the different tracks even if the planners for the different tracks are completely identical. Each directory must be self-contained and relocatable. Don't hardcode absolute paths anywhere, and don't use symbolic links that leave the planner directory.
- Some teams submit two planners, or two versions of the same planner. It is OK to submit them inside a single archive; in that case, create one set of directories for each planner or planner version. Again, please create several directories even if the planners/planner versions only differ in very minor things (e.g. command-line switches that must be used to run the planner).
- Planner directories should not contain any unnecessary files (editor backup files, CVS or .svn directories, .DS_Store files, object files, bytecode, ...), but README files that may help with troubleshooting the planner are appreciated.
Example: You submit the planners "Find Plans Quickly" that participates only in the satisficing temporal track, and "Find Optimal Plans Slowly" that participates in both temporal tracks. Then your archive should contain the following directories:
tempo-sat-fpq
tempo-sat-fops
tempo-opt-fops
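The naming and cleanliness rules above can be checked locally before you send the archive. The following is only a sketch: the function names valid_planner_dir and find_clutter are hypothetical, the set of characters allowed in planner names (a-z, 0-9 and hyphens) is an assumption, and the list of clutter patterns is illustrative rather than exhaustive.

```shell
# Hypothetical pre-submission checks; not part of the official tooling.

# Succeed if NAME looks like <track>-<planner-name> in lower case.
# Assumption: planner names use only a-z, 0-9 and hyphens.
valid_planner_dir() {
    printf '%s\n' "$1" |
        grep -Eq '^(seq|tempo|netben)-(sat|opt)-[a-z0-9][a-z0-9-]*$'
}

# List files and directories under DIR that the rules call unnecessary.
# (CVS keeps its metadata in directories named CVS.)
find_clutter() {
    find "$1" \( -name '*~' -o -name 'CVS' -o -name '.svn' \
        -o -name '.DS_Store' -o -name '*.o' -o -name '*.pyc' \
        -o -name '*.class' \) -print
}
```

For example, valid_planner_dir tempo-sat-fpq succeeds, and find_clutter tempo-sat-fpq prints nothing if the directory is clean.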
Compiling your planner
Each planner directory should contain a shell script, named build, which completely builds your planner. Be sure that your build script is executable. You may assume that build is run from the directory in which it resides. In the common case that you want to use the make tool to build your planner, your build script should look like this:
#! /bin/bash
make
- Your build script will be run with limited user rights, but still please make sure that it doesn't contain any operations that can wreak havoc on the computer. In particular, it must not write to any directories outside the directory it is run in (creating and using subdirectories is fine), and it must not use the network.
- If there are reasons to expect that your planner won't build on a standard Linux machine (e.g. because it uses unusual libraries), please explain any potential issues in your submission email.
Example: You submitted an archive named fpq-fops.tar.bz2 that contains the "Find Plans Quickly" and "Find Optimal Plans Slowly" planners mentioned above. In that case, we will build your planners as follows:
tar xjf fpq-fops.tar.bz2
cd tempo-sat-fpq
./build
cd ..
cd tempo-sat-fops
./build
cd ..
cd tempo-opt-fops
./build
cd ..
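To catch build problems before submitting, you can rehearse these steps on a fresh copy of your archive. The sketch below is an assumption about how such a rehearsal might look, not the script we will actually run. The function name dry_run_build is hypothetical, and it only handles tar archives (tar.gz or tar.bz2), not zip.

```shell
# Hypothetical rehearsal of the build procedure described above.
# Usage: dry_run_build ARCHIVE SCRATCH_DIR
# Assumption: ARCHIVE is a tar archive that this system's tar can read.
dry_run_build() (
    archive=$1
    scratch=$2
    mkdir -p "$scratch" || exit 1
    tar xf "$archive" -C "$scratch" || exit 1
    for dir in "$scratch"/*/; do
        # Run each build script from its own directory.
        if ! (cd "$dir" && ./build); then
            echo "build failed in $dir" >&2
            exit 1
        fi
    done
)
```

A call such as dry_run_build fpq-fops.tar.bz2 /some/empty/dir returns a non-zero status as soon as one planner directory fails to build.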
Running your planner
Each planner directory should contain a shell script named plan which accepts three arguments: a domain file, a problem file, and the filename in which the result plan should be stored.
- Invoking the script with these three arguments should run your planner. You may assume that the planner is run from the directory in which it resides. You may also assume that the input files and output filename are contained in this directory (and not in a subdirectory). We will run the script in an environment that limits memory usage to 2 GB and overall runtime to 30 minutes, so you don't need to take such measures manually.
- Your planner may write whatever it wants to the stdout and stderr streams. Diagnostic output to these streams will be logged, and we encourage you to produce any output that may help in troubleshooting the planner.
- The solution should be written to the result file in a format understood by the VAL validator. (At the time of this writing, VAL doesn't yet support the new features in PDDL 3.1. We will of course use an updated version of VAL which does support these features for validation.)
- Your planner will be run with limited user rights, but still please make sure that it doesn't contain any operations that can wreak havoc on the computer. In particular, it must not write to any directories outside the directory it is run in (creating and using subdirectories is fine), and it must not use the network.
- If your planner generates any temporary files, we will automatically clean these up after each planner run, restoring the planner directory to its previous state. Don't create temporary files with names ending with .pddl, .soln or .log, as we will use such names for inputs, outputs, and redirected stdout/stderr streams, respectively.
Example: The executable for the "Find Plans Quickly" planner is called fpq and accepts the same command-line arguments as the FF planner. Then your plan script may look as follows:
#! /bin/bash
./fpq -o $1 -f $2
mv $2.soln $3
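Although the plan script itself does not need to enforce the limits, you may want to see how your planner behaves under them. The following sketch approximates the competition limits with the shell's ulimit builtin. It is an assumption about a convenient local setup on Linux, not a description of the actual evaluation environment, and the function name run_limited is hypothetical.

```shell
# Hypothetical local test harness; the real evaluation environment may differ.
# Usage: run_limited CPU_SECONDS MEMORY_KB COMMAND [ARGS...]
run_limited() (
    cpu_seconds=$1
    mem_kb=$2
    shift 2
    # The limits apply only inside this subshell and to the command it runs.
    ulimit -t "$cpu_seconds" || exit 1   # CPU time in seconds
    ulimit -v "$mem_kb" || exit 1        # virtual memory in kilobytes
    exec "$@"
)
```

With 30 minutes being 1800 seconds and 2 GB being 2097152 KB, a test run might be invoked as run_limited 1800 2097152 ./plan domain.pddl problem.pddl plan.soln.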
Special planner aspects
Some technical aspects of a planner may make the evaluation more complicated. Please check the following list to see if any of the points applies to your planner. If so, please clearly indicate this in the submission email, and give us the detailed information we need in order to run the planner.
Anytime algorithms: In the satisficing tracks, if your planner produces multiple plans, please don't reuse the same filename for the generated plans, because this will lead to problems if your planner times out in the middle of writing a new plan. Instead, append .1 to the plan filename for the first plan that is generated, .2 for the second, and so on. We will evaluate the last complete plan that was generated, so please only output plans in increasing order of quality. (Of course, there is no point in producing lower-quality plans than ones that were previously generated anyway.) For example, if your anytime planner is called as ./plan domain.pddl problem.pddl plan.soln, the first plan it generates should be named plan.soln.1, then plan.soln.2, and so on.
Randomized algorithms: If your planner uses randomized algorithms, please initialize the random seed to a fixed constant such as 2008. If there are any reasons to expect that your planner won't generate reproducible results, please tell us clearly in the submission email.
Concurrency: If your planner spawns concurrent threads or processes, please tell us. (Note that the time limit is on overall CPU time, not wall-clock time, and that the memory limit applies to the total memory usage of all concurrent processes you run. So there is little reason why your planner should use concurrent processes.) There is no need to inform us if processes are spawned sequentially, e.g. if your planner script first runs a preprocessing module and then the actual planner. (Of course, time spent by the preprocessing module counts towards the total allocated CPU time.)
Disk usage: If your planner uses external-memory (disk-based) search algorithms, we will need to run it in an isolated environment, so please tell us. In that case, we will of course take time spent doing I/O into account for the time limit. Please also tell us how much hard disk space the planner should be expected to require, at maximum, during execution. The planner is only allowed to write to the directory in which it is invoked, or subdirectories thereof. (For example, don't write to /tmp.)
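For the anytime naming scheme described above, a plan script might number its output files as sketched below. The helper names next_plan_name and save_plan and the .partial suffix are hypothetical. Writing to a scratch file first and renaming it afterwards is one possible way to make sure a numbered file only ever holds a complete plan; it is a suggestion, not a requirement.

```shell
# Hypothetical helpers for an anytime planner's plan script.

# Print the next unused plan name: BASE.1, then BASE.2, and so on.
next_plan_name() (
    base=$1
    n=1
    while [ -e "$base.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$base.$n"
)

# Read a complete plan from stdin and store it under the next plan name.
# The scratch name avoids the reserved endings .pddl, .soln and .log.
save_plan() (
    base=$1
    target=$(next_plan_name "$base")
    cat > "$base.partial" && mv "$base.partial" "$target"
)
```

If the script is called as ./plan domain.pddl problem.pddl plan.soln, then piping each improved plan into save_plan "$3" yields plan.soln.1, plan.soln.2, and so on.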
Alternative domain formulations
Each evaluation domain for the competition will come in several alternative formulations that use different subsets of the optional features for that track (including, of course, formulations that use none of the optional features). For example, we will provide formulations with and without derived predicates, and formulations with and without object fluents. In some cases, it is clear which formulation is most appropriate for a planner (e.g., a planner that does not support derived predicates will only use formulations that don't make use of them). In other cases, the decision is not as clear. For example, a planner might support conditional effects, but still prefer domain formulations that don't use them in some cases.
In previous competitions, participants could manually choose which of the alternative formulations their planners would use in each domain. Due to the blind evaluation mode, this won't be possible this year. Instead, we will choose the "best" formulation for each planner in each domain by first using probing runs, where each planner attempts each formulation of each task in each domain with a reduced timeout of 1 minute. For each domain, we check for which of the formulations the planner achieved the best score in the probing runs, according to the evaluation criteria of the competition. This formulation is then considered the "best" formulation of that domain for that planner, and the results of the probing runs are thrown away. In the final evaluation (which then uses a timeout of 30 minutes), the planner will only be evaluated against the formulation selected earlier. (In case of ties, we will prefer formulations that use a larger subset of PDDL features, because most planners perform better if fewer features need to be compiled away.)
We reserve the right to run some planners against multiple formulations in the final evaluation and only use the best result for scoring, for example in cases where two planners achieve very similar scores or in cases where the probing runs produce very erratic results. In any case, formulations are chosen per-domain, not per-instance.
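The per-domain selection rule, including the tie-break, can be illustrated with a small sketch. Everything about the input format is an assumption made for illustration: each input line holds a formulation name, the planner's probing score for the domain, and a count of the optional PDDL features the formulation uses, separated by single spaces. Using a feature count to stand in for "a larger subset of PDDL features" is likewise a simplification, and best_formulation is a hypothetical name.

```shell
# Hypothetical illustration of the selection rule; not the actual tooling.
# Input lines (assumed format): NAME SCORE FEATURE_COUNT
# Picks the highest score; ties go to the larger feature count.
best_formulation() {
    LC_ALL=C sort -t ' ' -k2,2nr -k3,3nr | head -n 1 | cut -d ' ' -f 1
}
```

Given the lines "strips 12.5 0", "derived 14.0 1" and "fluents 14.0 2", the function prints fluents: both derived and fluents reach the top score, and fluents uses more features.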
If you feel that this automated method won't lead to optimal selection of domain formulations for your planner, please bring up the issue in the submission email so that we can work out an alternative policy.
Bug fix policy
In some cases, we will offer the opportunity to fix bugs that arise during the evaluation period, but any changes after the submission deadline will be strictly limited to bugfixes only. We will use a diff tool to check that patches don't contain new features or parameter tuning, and will reject patches that don't look like minimal changes to fix bugs. It is your responsibility to provide patches that are easy to verify with a diff tool. We reserve the right to reject changes for which the only-bugfixes rule is unnecessarily hard to check (e.g. because you reformatted the whole code).
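One way to produce a patch that is easy to verify is a recursive unified diff between the submitted source tree and the fixed one. This is only a suggestion, since no particular patch format is prescribed here, and the function name make_patch is hypothetical.

```shell
# Hypothetical helper: print a unified diff between two source trees.
# Usage: make_patch SUBMITTED_DIR FIXED_DIR > bugfix.patch
make_patch() {
    diff -ruN "$1" "$2"
    # diff exits with 1 when the trees differ; only 2 or more signals trouble.
    [ $? -le 1 ]
}
```

Keeping the fixed tree free of reformatting and build products keeps the resulting patch short.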