This talk addresses the multi-object tracking problem. It assumes that detections of the targets are available beforehand, and uses a graph-based approach to connect detections across time. As our main contribution, we introduce an original iterative aggregation strategy, which validates unambiguous matches first, based on local hypothesis testing about the target appearance. Specifically, each iteration considers a node, named the key-node, and investigates how to aggregate it with either previous or subsequent nodes, assuming that the appearance of the key-node is the appearance of the target. In practice, the aggregation is investigated by computing shortest paths within the key-node neighborhood, and the shortest aggregation path is validated for subsequent iterations of the algorithm only when it is considered to be sufficiently better than the alternative aggregation options. The approach is multi-scale in the sense that the size of the investigated neighborhood increases in proportion to the number of detections already aggregated into the key-node. Two main advantages arise from the proposed strategy. First, by making a (different) hypothesis about the target appearance at each iteration, our framework can benefit from appearance features that are sporadically available, or affected by non-stationary noise, along the sequence of detections. While such features are frequent in many practical scenarios, to the best of our knowledge, our work is the first to exploit them without making any a priori assumption about the possible appearances of the tracked objects. Second, the multi-scale and iterative nature of the process makes it both computationally efficient and effective, as demonstrated through extensive experimental validation.
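
To make the key-node iteration more concrete, the following is a minimal sketch of one forward aggregation attempt. It is not the implementation behind the talk: the `Detection` type, the cost in `edge_cost` (motion distance plus appearance mismatch with respect to the key-node), the `margin` ratio used as the "sufficiently better" test, and the `base_window` scaling rule are all hypothetical choices made for illustration. Only the overall structure (shortest path within a neighborhood, validation against the best alternative, neighborhood size growing with the number of aggregated detections) follows the description above.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    frame: int
    x: float
    y: float
    # Hypothetical scalar appearance cue; None when the feature is unavailable.
    appearance: Optional[float] = None


def edge_cost(a: Detection, b: Detection, key_appearance: Optional[float]) -> float:
    """Assumed cost: displacement, plus appearance mismatch w.r.t. the key-node
    whenever both the key-node and the candidate carry an appearance feature."""
    cost = math.hypot(b.x - a.x, b.y - a.y)
    if key_appearance is not None and b.appearance is not None:
        cost += abs(b.appearance - key_appearance)
    return cost


def shortest_path(
    key: Detection,
    frames: List[List[Detection]],
    banned: FrozenSet[Detection] = frozenset(),
) -> Tuple[float, List[Detection]]:
    """Cheapest path from `key` through one detection per frame.
    The frame-ordered graph is a DAG, so dynamic programming suffices."""
    best = {key: (0.0, [key])}
    for detections in frames:
        nxt = {}
        for d in detections:
            if d in banned:
                continue
            candidates = [
                (cost + edge_cost(prev, d, key.appearance), path + [d])
                for prev, (cost, path) in best.items()
            ]
            if candidates:
                nxt[d] = min(candidates, key=lambda cp: cp[0])
        best = nxt
    if not best:
        return math.inf, []
    return min(best.values(), key=lambda cp: cp[0])


def try_aggregate(
    key: Detection,
    following_frames: List[List[Detection]],
    n_aggregated: int = 1,
    base_window: int = 1,
    margin: float = 2.0,
) -> Optional[List[Detection]]:
    """Return the validated path, or None when the match is ambiguous."""
    # Multi-scale rule (assumed form): window grows with the aggregated count.
    window = base_window * max(1, n_aggregated)
    frames = following_frames[:window]
    best_cost, best_path = shortest_path(key, frames)
    if len(best_path) < 2:
        return None
    # Best alternative: the cheapest path that avoids the chosen first hop.
    alt_cost, _ = shortest_path(key, frames, banned=frozenset({best_path[1]}))
    return best_path if alt_cost > margin * best_cost else None


key = Detection(frame=0, x=0.0, y=0.0, appearance=1.0)
with_appearance = [[Detection(1, 1.0, 0.0, 1.0), Detection(1, 1.0, 0.2, 5.0)]]
without_appearance = [[Detection(1, 1.0, 0.0), Detection(1, 1.0, 0.2)]]
clear = try_aggregate(key, with_appearance)
ambiguous = try_aggregate(key, without_appearance)
```

In this toy example the two candidates are almost equally close to the key-node. When the appearance feature is present, the mismatching candidate becomes much more expensive and the match is validated; when it is absent, the two paths have nearly the same cost and the decision is deferred, which mirrors the "unambiguous matches first" principle.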