Appropriateness measurement in nonparametric item response theory modeling is affected by the reliability of the items, the test length, the type of aberrant response behavior, and the percentage of aberrant persons in the group. The detection rate (the percentage of simulees defined a priori as aberrant responders that were detected) increased as the mean item reliability, the test length, and the ratio of aberrant to nonaberrant simulees in the group increased. In addition, simulees "cheating" on the most difficult items in a test were more easily detected than those "guessing" on all items. Results became less stable across replications as item reliability or test length decreased. The results suggest that relatively short tests of at least 17 items can be used for person-fit analysis if the items are sufficiently reliable. Index terms: aberrance detection, appropriateness measurement, nonparametric item response theory, person-fit, person-fit statistic U3.
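For readers unfamiliar with the U3 statistic named in the index terms, the following is a minimal sketch of how it is commonly defined for dichotomous items. It illustrates the statistic only and does not reproduce the simulation design of this study. The function name `u3_statistic` and its arguments are hypothetical, and the item proportions-correct are assumed to be known and strictly between 0 and 1. Under this definition, U3 is 0 for a perfect Guttman pattern (only the easiest items correct) and 1 for the reversed Guttman pattern, so larger values indicate more aberrant response vectors.

```python
import math


def u3_statistic(responses, proportions_correct):
    """Sketch of the U3 person-fit statistic for one response vector.

    responses: list of 0/1 item scores (hypothetical input format).
    proportions_correct: list of item proportions-correct, each in (0, 1).
    Returns a value in [0, 1], or NaN when U3 is undefined.
    """
    if len(responses) != len(proportions_correct):
        raise ValueError("responses and proportions_correct must match in length")

    # Order items from easiest (highest proportion correct) to most difficult.
    ordered = sorted(zip(proportions_correct, responses), key=lambda pair: -pair[0])
    logits = [math.log(p / (1.0 - p)) for p, _ in ordered]
    scores = [x for _, x in ordered]

    k = len(scores)
    r = sum(scores)  # number-correct score

    # U3 is undefined for all-incorrect or all-correct vectors.
    if r == 0 or r == k:
        return float("nan")

    guttman_sum = sum(logits[:r])       # r easiest items correct
    reversed_sum = sum(logits[k - r:])  # r most difficult items correct
    observed_sum = sum(x * w for x, w in zip(scores, logits))

    denominator = guttman_sum - reversed_sum
    if denominator == 0.0:
        # All relevant items equally difficult: patterns cannot be distinguished.
        return float("nan")
    return (guttman_sum - observed_sum) / denominator


# Usage example with five hypothetical items, listed easiest to hardest.
item_p = [0.9, 0.8, 0.6, 0.4, 0.2]
u3_guttman = u3_statistic([1, 1, 1, 0, 0], item_p)   # perfect Guttman pattern
u3_reversed = u3_statistic([0, 0, 1, 1, 1], item_p)  # reversed Guttman pattern
u3_mixed = u3_statistic([1, 0, 1, 0, 1], item_p)     # intermediate pattern
```

In practice, a simulee would be flagged as aberrant when U3 exceeds some critical value. How that cutoff is chosen is not specified here and would need to follow the study's own procedure.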