PLoS ONE (Jan 2019)

Feed-forward visual processing suffices for coarse localization but fine-grained localization in an attention-demanding context needs feedback processing.

  • Sang-Ah Yoo,
  • John K Tsotsos,
  • Mazyar Fallah

DOI
https://doi.org/10.1371/journal.pone.0223166
Journal volume & issue
Vol. 14, no. 9
p. e0223166

Abstract

It is well known that simple visual tasks, such as object detection or categorization, can be performed within a short period of time, suggesting that feed-forward visual processing is sufficient for them. However, more complex visual tasks, such as fine-grained localization, may require high-resolution information available at the early processing levels of the visual hierarchy. To access this information using a top-down approach, feedback processing would need to traverse several stages in the visual hierarchy, and each step in this traversal takes processing time. In the present study, we compared the processing time required to complete object categorization and localization by varying the presentation duration and complexity of natural scene stimuli. We hypothesized that performance would reach asymptote at shorter presentation durations when feed-forward processing suffices for a visual task, whereas performance would gradually improve with longer presentation if the task relies on feedback processing. In Experiment 1, where simple images were presented, both object categorization and localization performance improved sharply up to 100 ms of presentation and then leveled off. These results replicate previously reported rapid categorization effects, but they do not support a role for feedback processing in localization tasks, indicating that feed-forward processing enables coarse localization in relatively simple visual scenes. In Experiment 2, the same tasks were performed, but more attention-demanding and ecologically valid images were used as stimuli. Unlike in Experiment 1, both object categorization performance and localization precision gradually improved as stimulus presentation duration increased. This finding suggests that complex visual tasks that require visual scrutiny call for top-down feedback processing.