Time series analysis of Landsat data is widely used for assessing forest change over large areas. Various change detection algorithms have been proposed, each employing different techniques to characterise abrupt disturbance events and longer-term trends. However, results can vary significantly depending on the algorithm, its parameters and the spectral index (or indices) chosen. This mismatch has led researchers to hypothesize that an ensemble-based approach may increase accuracy. In this study we assess two change detection algorithms (LandTrendr and the R package strucchange), each with three indices (the Normalized Difference Vegetation Index or NDVI, the Normalized Burn Ratio or NBR, and Tasseled Cap Wetness or TCW). We test their ability to detect abrupt disturbances in sclerophyll forests over a 29-year period, and subsequently evaluate a number of ensembles, using simple fusion rules and Random Forests models. A total of 4087 manually interpreted reference pixels, sampled from 9 million ha of forest, were used for training and validation. In addition, we assess the effects of priming the Random Forests classifier with confusing cases (commission errors from the time series algorithms). Our results clearly show that ensembles combining multiple change detection techniques outperform any single method. Our most accurate Random Forests model, using an ensemble of all 6 algorithm/index outputs along with 3 bi-temporal change rasters (change in NBR, NDVI and TCW), had an overall error rate of 7%, compared with 21% for the most accurate single algorithm/index approach (LandTrendr with NBR). Our findings also indicate that acceptable results (14% error) can be achieved without the use of traditional change detection algorithms, by using robust reference data and Random Forests classification.
However, by priming the classifier with confusing cases informed by the change detection algorithms, commission errors decreased substantially, at the expense of slight increases in omission errors. In fact, a Random Forests ensemble using the primed training data and only 3 bi-temporal change rasters was more accurate than any individual algorithm, with an overall error of 11%. Including some additional metrics derived from Landsat time series (e.g. 2-year changes and overall means) reduced this error further, to 8%. Given that most change detection algorithms have heavy processing requirements, this suggests that the algorithms can be applied to a sample of pixels only, for the sole purpose of training a machine learning classifier. We demonstrate the feasibility of this previously unexplored approach by creating annual disturbance maps over a large area of forest (9 million ha) and a long time period (29 years).
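As an illustration of two ingredients named above, the following minimal Python sketch computes NDVI and NBR from their standard normalized-difference definitions, derives a bi-temporal change value for one pixel, and applies a simple vote-counting fusion rule to binary disturbance flags. It is not the study's implementation: the reflectance values, the function names and the `min_votes` threshold are hypothetical, the abstract does not specify which fusion rules were used, and TCW and the Random Forests models are omitted.

```python
from typing import Sequence


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b); fall back to 0.0 if the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def nbr(nir: float, swir2: float) -> float:
    """Normalized Burn Ratio: (NIR - SWIR2) / (NIR + SWIR2)."""
    return normalized_difference(nir, swir2)


def bitemporal_change(index_before: float, index_after: float) -> float:
    """Change in an index between two dates (negative values indicate a drop)."""
    return index_after - index_before


def vote_fusion(flags: Sequence[bool], min_votes: int) -> bool:
    """Hypothetical fusion rule: flag a disturbance if at least
    `min_votes` of the algorithm/index outputs flagged one."""
    return sum(flags) >= min_votes


# Hypothetical surface reflectance for one pixel, before and after an event.
before = {"red": 0.04, "nir": 0.35, "swir2": 0.08}
after = {"red": 0.09, "nir": 0.18, "swir2": 0.20}

d_ndvi = bitemporal_change(ndvi(before["nir"], before["red"]),
                           ndvi(after["nir"], after["red"]))
d_nbr = bitemporal_change(nbr(before["nir"], before["swir2"]),
                          nbr(after["nir"], after["swir2"]))

# Hypothetical flags from six algorithm/index combinations for this pixel-year.
flags = [True, True, False, True, False, True]
disturbed = vote_fusion(flags, min_votes=3)

print(f"dNDVI={d_ndvi:.3f}, dNBR={d_nbr:.3f}, fused disturbance={disturbed}")
```

In a Random Forests ensemble of the kind described above, values such as `d_ndvi` and `d_nbr` would serve as predictor variables alongside the algorithm outputs, rather than being thresholded directly.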