Objective tools that can assess the demands associated with in-vehicle human-machine interfaces (HMIs) could assist automotive engineers in designing safer interactions. This paper presents empirical evidence supporting one objective assessment approach, which compares the demand associated with in-vehicle tasks to the demand associated with “benchmarking” or “comparison” tasks. The presented study used two types of benchmarking tasks, a modified surrogate reference task (SuRT) and a delayed digit recall task (n-back task), representing different levels of visual demand and cognitive demand, respectively. Twenty-four participants performed these two types of benchmarking tasks, as well as two radio tasks, while driving a vehicle on a closed-loop test track. Response measures included a physiological measure (heart rate), glance metrics, driving performance (steering entropy), and subjective workload ratings. Results suggested that multiple types and levels of benchmarking tasks can be used to provide insight into reference demand levels for real in-vehicle tasks. The four levels of the visually loading task (the modified SuRT) produced graded responses in glance measures, steering entropy, and subjective workload ratings. The cognitively loading task (the n-back task) generated scalable responses in heart rate and subjective workload ratings. The range of visual and cognitive demands from the radio tasks could be usefully compared to the two benchmark tasks. Age was also found to be a significant mediating factor for secondary-task-associated workload based on all response measures. Results also suggested that driving performance and physiological measures extend the understanding gained from benchmarking tasks beyond that yielded by subjective workload ratings.