In this paper we introduce two natural language generation (NLG) systems that we developed for the GIVE Challenge, which was aimed at the evaluation of NLG systems. The Challenge involved automatically generating instructions for users to carry out a task in a 3D game environment. One of our systems focused on generating optimally helpful ‘serious’ instructions, while the other focused on entertainment, providing more playful instructions. We used the data gathered in the Challenge, both subjective user ratings and objective task performance data, to compare the efficiency and entertainment value of the two systems. We found a clear difference in efficiency, but were unable to show that one system was more entertaining than the other. A possible explanation is that the set-up and evaluation methods of the GIVE Challenge were not aimed at measuring entertainment. Based on our experiences, we give some suggestions for the set-up of future installments of the Challenge.