
Title:
Toward a better understanding of target distinctiveness in visual search: How color, shape, and texture information combine to guide search.
Authors:
Xu ZJ; Department of Psychology, University of Illinois Urbana-Champaign. Lleras A; Department of Psychology, University of Illinois Urbana-Champaign. Buetti S; Department of Psychology, University of Illinois Urbana-Champaign.
Source:
Journal of experimental psychology. General [J Exp Psychol Gen] 2026 Mar; Vol. 155 (3), pp. 839-875. Date of Electronic Publication: 2026 Jan 22.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: American Psychological Assn Country of Publication: United States NLM ID: 7502587 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1939-2222 (Electronic) Linking ISSN: 00221015 NLM ISO Abbreviation: J Exp Psychol Gen Subsets: MEDLINE
Imprint Name(s):
Original Publication: Washington, American Psychological Assn.
Grant Information:
National Science Foundation
Entry Date(s):
Date Created: 20260122 Date Completed: 20260205 Latest Revision: 20260205
Update Code:
20260205
DOI:
10.1037/xge0001895
PMID:
41569513
Database:
MEDLINE

Abstract:
People often search for objects distinctive from other objects in the scene along multiple feature dimensions like color and shape. A target distinctive in more than one dimension can lead to an easier search, but it also increases the complexity of modeling search behaviors. Building upon previous research on how people search using information along two feature dimensions, we explored how search unfolds when the target and distractors differ along the dimensions of color, shape, and texture (a tridimensional search). Using a behavioral-computational approach, we found that the target-distractor distinctiveness signal along each dimension combines in a weighted orthogonal way to guide tridimensional searches. Additionally, across two sets of experiments, we demonstrated that the weight assigned to each dimension varied according to its relative usefulness. When the color distinctiveness was most pronounced (Set 1), there was a much stronger prioritization of color information over information carried by shape and texture. When the distinctiveness along individual dimensions was more balanced (Set 2), the weights were distributed more evenly across the three dimensions, but a color prioritization remained. These results have broad implications for cognitive neuroscience, as they place constraints on how visual information from different dimensions is integrated to produce an overall guidance signal, and demonstrate how attention might be flexibly allocated across channels in response to the ecological aspects of the environment. This study should also interest modelers in cognitive science because it demonstrates an approach to understand behavior in complex scenarios based on performance indices estimated under simpler conditions. (PsycInfo Database Record (c) 2026 APA, all rights reserved)


Toward a Better Understanding of Target Distinctiveness in Visual Search: How Color, Shape, and Texture Information Combine to Guide Search

By: Zoe (Jing) Xu
Department of Psychology, University of Illinois Urbana-Champaign
Department of Psychology, University of Washington

Alejandro Lleras
Department of Psychology, University of Illinois Urbana-Champaign

Simona Buetti
Department of Psychology, University of Illinois Urbana-Champaign

Acknowledgement: Timothy Vickery served as action editor. This article has been posted as a preprint and is available on the Open Science Framework. The preprint can be accessed at https://osf.io/preprints/osf/857u2. All data, analysis code, and research materials are available on the Open Science Framework (https://osf.io/bmwa4/). Results in Experimental Set 1 were presented at the 2021 Annual Meeting of the Vision Sciences Society. The authors have no competing interests. This project was supported by a grant from the National Science Foundation (Grant BCS 1921735 awarded to Simona Buetti).

Zoe (Jing) Xu played a lead role in data curation, formal analysis, investigation, project administration, software, validation, visualization, and writing–original draft and an equal role in conceptualization, methodology, and writing–review and editing. Alejandro Lleras played a supporting role in formal analysis, funding acquisition, investigation, project administration, resources, validation, visualization, and writing–original draft and an equal role in conceptualization, methodology, supervision, and writing–review and editing. Simona Buetti played a lead role in funding acquisition, resources, and supervision, a supporting role in formal analysis, investigation, project administration, validation, visualization, and writing–original draft, and an equal role in conceptualization, methodology, and writing–review and editing.

The concept of a target template is commonly used in many visual search theories; it is usually defined as the mental representation of the target that people are looking for in the visual environment (e.g., Chun & Wolfe, 1996; Malcolm & Henderson, 2009; Wolfe, 2021). The majority of theories agree that the target template facilitates search behaviors by imposing top-down guidance on the search processes (e.g., Adeli et al., 2017; Buetti et al., 2016; Bundesen, 1990; Duncan & Humphreys, 1989; Hoffman, 1979; Hulleman & Olivers, 2017; Liesefeld et al., 2018; Rosenholtz et al., 2012; Wolfe, 1994, 2021; Zelinsky, 2008; for exceptions that propose search being determined solely by bottom-up signals, see Itti & Koch, 2000; Theeuwes, 1991, 1992; Ullman, 1987). Some theories propose a feature-boosting mechanism, whereby a subset of features in the target template that provides the largest discriminability between the target and distractors in the display is boosted. The resulting activation map reflects the extent to which each location contains the boosted target features, determining the likelihood of each location receiving attention and getting scrutinized (e.g., Adeli et al., 2017; Bundesen, 1990; Liesefeld et al., 2018; Wolfe, 1994, 2021; X. Yu & Geng, 2019). Other theories propose that the overall similarity (i.e., the "match") between the search object and the target template provides the top-down guidance that helps direct attention to the objects that are most likely to be the target (e.g., Buetti et al., 2016; Duncan & Humphreys, 1989; Hoffman, 1979).

While visual search theories tend to agree on the importance of the target template in search processes, there is a lack of consensus or even explicit discussion regarding how complex a target template can be or how much of the visual information contained within a target template can be utilized to guide attention. The majority of visual search theories do not discuss the limits on target template complexity (Adeli et al., 2017; Buetti et al., 2016; Duncan & Humphreys, 1989; Eckstein et al., 2000; Folk et al., 1992; Gaspelin & Luck, 2018; Hoffman, 1979; Hulleman & Olivers, 2017; Najemnik & Geisler, 2005; Navalpakkam & Itti, 2007; Wolfe, 1994; X. Yu & Geng, 2019; for a special case of target template, see Rosenholtz, 2016; Rosenholtz et al., 2012). Generally, these theories assume that the target template and the search items' representations are constructed and compared over all the relevant dimensions (i.e., the dimensions that differentiate the target from the distractors, e.g., location, motion, color, surface texture, shape, and size). For instance, Navalpakkam and Itti (2007) argued that visual input is represented in multiple feature dimensions including but not limited to color, orientation, and direction of motion. Zelinsky (2008) used a 72-dimensional feature vector to represent visual signals coming from each location of the search scene in his model of human eye movements. Consistent with this perspective, many empirical studies used real-world objects as search stimuli, where the target template receives little constraint in terms of its dimensionality or complexity. Wang et al. (2017) used teddy bears, reindeer, and car models as search stimuli and found that the difference between the target and distractors can guide search behaviors when the two differ along multiple dimensions in a complex way. Henderson et al. (2009) found that people were able to search for the target among distractors that were real-world objects appearing in real-world scenes (see also Malcolm & Henderson, 2009). These latter studies measured how people search for real-world objects that contain visual information from various dimensions, which suggests that people are capable of making comparisons between search items and a complex target template.

However, the ability to search for a target defined along multiple dimensions does not guarantee that the attentional guidance of the target template occurs along all those dimensions. Even though the target and distractors may differ along multiple dimensions, it is possible that only a subset of those dimensions actually guides the search. At first glance, this proposal might seem contradictory to the empirical findings that suggest more specific visual templates of the target provide stronger guidance than less specific or accurate templates (e.g., Malcolm & Henderson, 2009; Vickery et al., 2005; Wolfe et al., 2004). For instance, Vickery et al. (2005) found that cueing target images that differ in size or orientation from the actual targets led to slower searches, compared to cueing the exact target image that appears in the search display, suggesting that the detailed visual information stored in the target template, rather than general schematic or semantic information, guides the search process. However, the conclusion that a more specific target template provides stronger guidance arises from empirical evidence at two extremes of this "specificity" continuum: from being as abstract as a word or as distorted as an example differing from the true target in size or orientation to being as specific and precise as the exact image of the target. The optimal solution might lie somewhere in between: there might be a threshold for the amount of information in the target template that can be effectively utilized to guide the search, and additional details in the target template may not provide any additional guidance, or they might be utilized in a less efficient way.

Indeed, several visual search theories (Alexander et al., 2019; Liesefeld et al., 2018; Williams, 1967) suggest limits in how many features guide attention. Williams (1967) showed that when targets had multiple features, eye movements were mainly guided by color, even when shape and size were known. Only in specific cases, like when the target was the largest size, did size also influence fixations. This implies a limit on feature-based guidance.

More recently, Alexander et al. (2019) found that previewing a mismatched target, which differed from the actual target in either shape or orientation, caused a slowdown in visual search—but only when the stimuli were grayscale. When the target differed from distractors along the color dimension, previewing a mismatched target in shape or orientation caused little slowdown. This result suggests that people typically use color to guide their search and that shape and orientation become guiding factors only in the absence of color information.

There is also a more recent proposal that two distinct forms of target template exist: one that is primarily involved in providing initial guidance during the search process and a second one engaged in the actual identification of the target once an item is in the focus of attention (Hamblin-Frohman & Becker, 2021; Wolfe, 2021; X. Yu et al., 2022, 2023). The guiding template tends to be simpler or coarser compared to the more detailed verification template. This proposal aligns with the idea that attentional guidance during search might be based on only a subset of the information stored in the full template of the target.

Moreover, the fact that people can search for complex targets does not reveal whether visual signals from different feature dimensions contribute equally to the top-down guidance or whether signals along different dimensions are utilized differently during multidimensional search conditions. In fact, some theories have delved into this question (e.g., Bundesen, 1990; Gaspelin & Luck, 2018; Liesefeld et al., 2018; Wolfe, 1994). In the theory of visual attention, Bundesen (1990) proposed the mechanism of filtering, which refers to the prioritization of certain categories (e.g., red items) by assigning them a higher attentional weight, leading to an increased likelihood of objects belonging to these categories being selected for attention. The signal suppression hypothesis (Gaspelin & Luck, 2018; Sawaki & Luck, 2010) approaches the same idea from the other extreme: To minimize distractions from salient stimuli, features unique to those stimuli can be actively suppressed, ensuring they receive no attentional weights (also see Treisman & Sato, 1990). In both the theory of visual attention and the signal suppression hypothesis, changes in attentional weight apply to specific features (e.g., red, vertical).

The dimension weighting account addressed the attentional weighting process at a feature dimension level (e.g., Found & Müller, 1996; Müller et al., 2003; see also Liesefeld et al., 2018, for a review). Using an oddball search task, where the target identity varied randomly from trial to trial and differed from distractors along either color or orientation, Found and Müller (1996) found an intertrial facilitation: Search was faster when the target and distractors differed in the same dimension as the previous trial (e.g., trial N: color difference; trial N + 1: color difference), compared to when they differed in a different dimension (e.g., trial N: color difference; trial N + 1: orientation difference). This pattern suggests that a specific feature dimension can be upweighted in anticipation of its usefulness for distinguishing between the target and distractors, and further work suggests participants may have the ability to adjust the weight associated with any specific dimension (Müller et al., 2003).

Overall, a number of theories have tried to address whether there is a limitation on the complexity of a target template and whether specific features or feature dimensions of the target template can be up- or downweighted to facilitate the search process. However, none of those theories quantify any target dimensionality limitations or the strength of dimensional weighting. In other words, current theories have not attempted to quantitatively capture how visual signals from different feature dimensions simultaneously contribute to the top-down guidance and affect search behavior. In the current work, we used a behavioral-computational approach to study multidimensional searches by quantitatively estimating the contribution of visual information from three different feature dimensions (color, shape, and texture) in guiding these searches and making mathematical predictions on search performance.

<h31 id="xge-155-3-839-d543e353">The Present Study</h31>

We investigated how observers utilize information stored in the target template to guide attention during efficient visual searches in situations where the target differs from distractors along three feature dimensions—color, shape, and surface texture—referred to as tridimensional searches. Efficient search refers to situations where the features characterizing the target are sufficiently visually distinct from those defining the nontarget objects, such that peripheral vision is able to process multiple objects in parallel until it locates the target (e.g., Lleras et al., 2022). In the following tridimensional search experiments, the target template is defined by a color (e.g., red), a shape (e.g., octagon), and a texture (e.g., a cross). The distractors do not share any of these features with the target (e.g., see Figure 1, Experiments 4–12). Therefore, observers can rely on multiple visual signals from different feature dimensions to distinguish the target from distractors. It is currently unknown how the visual signal along three feature dimensions is integrated to guide the search process and if there are any constraints on this integration process. Indeed, it is possible that not all dimensions contribute equally to attentional guidance, or perhaps only a subset of the dimensions contributes to guidance.
[Figure 1]

Previous studies have adopted a behavioral-computational approach to examine feature integration in situations where the target differs from distractors along two feature dimensions: Color and shape were examined in Buetti et al. (2019) and Hughes et al. (2024), whereas shape and texture were studied in Xu, Lleras, and Buetti (2021).

In Buetti et al. (2019), search slopes were first measured in unidimensional search conditions, where people searched for a target that differed from distractors along one feature dimension, either color (e.g., searched for a red target among yellow distractors or among orange distractors) or shape (e.g., searched for a triangle target among semicircle distractors or among diamond distractors). Next, the search times in bidimensional search conditions were measured, where people searched for a target that differed from distractors along both color and shape dimensions (e.g., searched for a red triangle among yellow semicircles). The authors then used a model comparison approach to compare different mathematical models that varied in how they combined the search slopes from the unidimensional search conditions to predict the search times observed in the bidimensional searches. Buetti et al. found that information along color and shape dimensions is accumulated independently and is subsequently combined in a collinear fashion (Garner, 1974; Garner & Felfoldy, 1970), following a city-block metric (also see Pramod & Arun, 2014, 2016; but see Hughes et al., 2024, who found evidence supporting a Euclidean metric combination). Xu, Lleras, and Buetti (2021) found that shape and texture dimensions combine orthogonally, following a Euclidean distance metric (Garner, 1974; Garner & Felfoldy, 1970).

Note that while Buetti et al. (2019) offered an approach to better understand the mathematical rules governing the combination of color and shape in guiding attention, the models considered in the study assumed that color and shape signals are utilized equally. A similar assumption is made in Xu, Lleras, and Buetti (2021), where shape and texture were presumed to be utilized equally to guide attention during bidimensional searches.

Using the same approach, here, we evaluated performance in tridimensional search conditions, where the target differs from distractors along color, shape, and texture dimensions. Our focus was to examine whether signals from all these dimensions contribute to attentional guidance, how they are integrated, and importantly, whether signals from different feature dimensions are utilized to varying degrees to determine the overall guidance. We conducted two sets of experiments. The first set used stimuli similar to those in previous bidimensional studies (Buetti et al., 2019; Xu, Lleras, & Buetti, 2021), featuring targets that were more distinctive from distractors on the color dimension compared to the other two dimensions. The second set used stimuli that were better controlled for target–distractor differences across the three feature dimensions. Importantly, these two sets of experiments allowed us to not only confirm the best performing models across different data sets but also investigate the extent to which the most effective models and their parameters vary with different stimuli.

Experimental Set 1

A total of 12 experiments were conducted in this set, each with a naïve group of participants. We first conducted three unidimensional search experiments to measure the search slopes in conditions where the target differed from distractors along color only (Experiment 1, color search), along shape only (Experiment 2, shape search), and along texture only (Experiment 3, texture search). Furthermore, we conducted nine tridimensional search experiments where the target differed from each type of distractors along all three dimensions: color, shape, and texture (Experiments 4–12; Figure 1). The methods and experimental protocols (IRB No. 05550: Attentional mechanisms in human vision) were approved by the Institutional Review Board at the University of Illinois, Urbana–Champaign, and are in accordance with the Declaration of Helsinki.

<h31 id="xge-155-3-839-d543e439">Method</h31>

Transparency and Openness

For Experiments 1–24, we report how we determined our sample size, all data exclusions, all manipulations, and all measures, following Journal Article Reporting Standards (Appelbaum et al., 2018). All data, analysis code, and research materials are available on the Open Science Framework (Xu, Lleras, & Buetti, 2024) at https://osf.io/bmwa4/. Data were analyzed using R (Version 4.2.3) and Excel. Experiments 1–12's design and analysis were preregistered (Xu et al., 2020) at https://osf.io/p5txf. Experiments 13–24 were not preregistered, but they only differed from Experiments 1–12 in terms of the stimuli used.

Participants

Participants were recruited from either the University of Illinois at Urbana–Champaign or Prolific, in exchange for course credit or for money. Sample size was determined based on data simulation of the previous bidimensional search study (Xu, Lleras, & Buetti, 2021). We estimated the sample size required to produce a small standard error on reaction time (20.33 ms) and on the magnitude of the search slope estimate (3.17 ms/log unit) in the most variable condition (defined by a specific distractor type × set size) in that study. These simulations demonstrated that we would need to include 35 valid participants in each experiment (sample size rationale is detailed in our preregistration report at https://osf.io/p5txf). A post hoc power analysis (see Appendix) showed satisfactory model distinguishability starting at a sample size of 7, and model parameter estimates stabilized around a sample size of 20.

For each experiment, four participant inclusion criteria were used: (1) Participants should complete all the trials (i.e., the experiment was not aborted before finishing), (2) participants should make a response on at least 85% of the trials, (3) search accuracy should be higher than 90%, and (4) an individual's average response time (RT) should fall within 2 standard deviations of the group average RT. The accuracy rate was calculated as the number of trials on which participants made a correct response divided by the total number of trials on which participants made a response. That is, we excluded time-out trials (i.e., trials where participants did not make any response within 5 s; see below for a detailed experimental procedure) when computing the accuracy, as the experiments were conducted online, and it was impossible to ascertain the reason for time-outs. The number of recruited participants, the number of participants included in the analysis, the number of participants excluded for each criterion, included participants' demographic information, and summary statistics of the measurements in each experiment are shown in Table 1.
[Table 1]

Apparatus and Stimuli

All experiments were programmed in JavaScript and conducted on Pavlovia, with participants using their own computers. Because experiments were run online, we had no control over the visual angle of the stimuli on participants’ computers. To compensate for this, before the experiment, we asked participants to rescale an image of a credit card to match the real size of a credit card in order to ensure that stimuli across different computer displays maintained the same physical size (1.2 × 1.2 cm). Stimuli were randomly assigned to a location on the display with a small random jitter, based on two concentric circular grids occupying an area of 15 × 15 cm on the center of participants’ screens. The larger grid had a diameter of 13.8 cm, and the smaller grid had a diameter of 7.4 cm. This size was chosen to allow participants with screens as small as 12.5 in. to see the full search display.

The stimuli were shown on a white background. In any given trial, there was only one target and one type of distractor on the display. In other words, displays were always target-present and homogeneous in terms of distractors. The stimuli had a black square dot on either their left or their right, and the task was to report the location of the dot on the target stimulus. Stimuli used in Experiments 1–12 are shown in Figure 1.

Unidimensional Experiments

In Experiment 1 (color search), the target was a red octagon with a white cross texture inside (Figure 1). Distractors shared the shape (octagon) and texture (cross) with the target, but their color was either orange, green, or pink. In Experiment 2 (shape search), the target was a gray octagon with a white cross texture. Distractors shared the color (gray) and texture (cross) with the target, but their shape was either a triangle, a house, or a square. In Experiment 3 (texture search), the target was a gray octagon with a white cross texture. Distractors shared the same color (gray) and shape (octagon) with the target, but their texture was made of either a dot, lines forming a tilted pound key, or a solid gray texture.

Tridimensional Experiments

In Experiments 4–12, the target was always a red octagon with a white cross texture inside (the same as Experiment 1), and the distractors were constructed by combining all the distractor colors, shapes, and textures used in Experiments 1–3. There were in total 27 types of tridimensional distractors (i.e., 3 colors × 3 shapes × 3 textures). To keep the study design consistent across all experiments, we only tested three types of distractors in each experiment. Therefore, we divided the tridimensional distractors into nine experiments (Experiments 4–12), each containing three types of distractors. Only one type of distractor was presented along with the target on a given trial.

Design

In each experiment, participants searched for the target among one of three types of distractors (e.g., in Experiment 1, participants searched for the red target among orange, pink, or green distractors). We also included a target-only condition where no distractors were presented. For each type of distractor, there were four distractor set sizes: 1, 4, 9, and 19. In total, each experiment contained 13 conditions (3 distractor types × 4 set sizes, plus the target-only condition), each repeated 48 times, for a total of 624 trials. Sample displays are shown in Figure 2.
[Figure 2]

Procedure

Each trial began with a black cross appearing for 0.5 s at the center of the screen over a white background. A search display followed. Participants were asked to search for the target among distractors and report whether the black square dot was on the left or right side of the target by pressing the corresponding left or right arrow key on the keyboard. The search display remained on the screen for 5 s or until a response was made by the participants, whichever occurred first. Visual feedback (“Correct!” or “Wrong!”) was provided after each trial, lasting for 0.5 s. The trial then ended with a white background displayed for an interval of 0.5 s.

Behavioral–Computational Predictive Approach

When observers search for a known target among sufficiently different distractor items, processing occurs in parallel and simultaneously at all item locations, within a sufficiently large functional viewing field (Hulleman & Olivers, 2017). In the present study, we expected participants would perform a parallel search over the whole search display because we used similar stimuli as in previous studies where parallel search was obtained (see Buetti et al., 2019; Xu, Lleras, & Buetti, 2021). Such parallel processing is considered to be unlimited capacity, with search items being processed in an independent and exhaustive manner (Buetti et al., 2016). At each location, a contrast signal between the target template and the search item is computed. This contrast signal accumulates stochastically until reaching a rejection threshold, indicating that the item is no longer considered as a potential target (Buetti et al., 2016; Lleras et al., 2020; Townsend & Ashby, 1983). Items that are not rejected during this parallel processing stage are then scrutinized serially until the target is identified. In easy searches, when the target is surrounded by only one type of sufficiently different distractors, the only item that survives the parallel stage is typically the target (e.g., Buetti et al., 2016, 2019; Xu, Lleras, & Buetti, 2021).

The stochastic contrast accumulation that happens in parallel at all locations across the whole display produces a signature logarithmic increase in RT as a function of set size (Buetti et al., 2016; Lleras et al., 2020; Townsend & Ashby, 1983). The logarithmic slope LS indexes the time required for one single distractor to be rejected, which is influenced by the visual distinctiveness of the target in relation to the distractor. This distinctiveness term refers to a target–distractor perceptual difference in a top-down fashion—that is, a computation of how perceptually different the target is from the distractor. This distinctiveness is different from the concept of purely bottom-up contrast, which is a computation of how perceptually different an element in the scene is from its immediate surroundings (i.e., the background). The more distinctive the target, meaning the more dissimilar the target and distractors are, the shorter the time needed for this type of distractor to reach the rejection threshold, and the shallower the slope LS will be. Target contrast signal theory proposed that the steepness of the logarithmic slope LS is inversely proportional to the overall top-down contrast/distinctiveness signal C being accumulated for a given target–distractor pair (Equation 1; Lleras et al., 2020):

$$LS = \frac{\alpha}{C} \tag{1}$$

with LS being the logarithmic slope and α being a multiplicative constant factor (Lleras et al., 2020).

In the present study, we estimated the logarithmic search slopes for each of the three target–distractor color pairs (Experiment 1, unidimensional color search), three shape pairs (Experiment 2, unidimensional shape search), and three texture pairs (Experiment 3, unidimensional texture search). We then used these logarithmic slope values to predict the tridimensional search slopes LS_c,s,t, where the target and distractors differed along color, shape, and texture. We considered 10 different predictive models (discussed in the Models Retained for Model Comparison section), each based on a unique assumption about how the contrast signals along the three dimensions combine to guide attention in the tridimensional search conditions. That is, for each model, we computed the predicted search RTs for each condition of a specific distractor type and a set size level, using Equation 2 (Lleras et al., 2020):

$$RT_{predicted} = RT_{0} + LS_{c,s,t} \times \ln(N + 1) \tag{2}$$

where RT_0 represents the RT in the target-only condition, LS_c,s,t represents the predicted search slope in tridimensional search, and N is the number of distractors, so that the final term is the natural logarithm of the total set size (all distractors plus the target).

Next, we compared the predicted RTs with the observed RTs across all distractor type by set size conditions in Experiments 4–12 (tridimensional search) by regressing the observed RTs onto the predicted RTs (Equation 3):

$$RT_{observed} = a + b \times RT_{predicted} \tag{3}$$

where a and b are free parameters in the simple linear regression.

In sum, for each tested model, we computed a set of predicted logarithmic slope LS_c,s,t values, which, in turn, allowed us to compute a set of predicted RTs (RT_predicted) for all the conditions run in Experiments 4–12, using Equation 2. These predicted RTs were then compared to the observed RTs (using Equation 3) to determine how well each model predicts the tridimensional search performance. Overall, there were 108 mean RTs (27 tridimensional distractor types × 4 set size levels) predicted by each model. The validity of the 10 models was compared based on their R², Akaike information criterion (AIC) values, and AIC comparison likelihoods.

In addition to utilizing our behavioral-computational predictive approach, we also applied a Minkowski r-metric model (Equation 4), which is a comparison model aimed at validating the performance of the 10 tested models. The Minkowski r-metric model contains the parameter r, which captures the extent to which the leading feature dimension (i.e., the one providing the largest contrast signals) is prioritized (e.g., Nosofsky, 1986). A Minkowski r close to 1 indicates an equal contribution among the three dimensions, reflecting that the magnitude of the contrast signals from each feature dimension is simply added to each other to determine the overall guidance, with no feature dimension receiving more weight than the others. In other words, the information carried by each dimension is considered equally by the system when determining overall guidance. Conversely, larger r values imply that there is a more informative dimension that receives greater weight compared to the less informative dimensions when combining their respective contrast signals for overall guidance. This is because r values larger than 1 effectively exaggerate the impact of the dimension with the largest contrast value, relative to those with smaller contrast values. This model serves as a benchmark for the 10 models tested in this study, validating the level of disproportion each model hypothesizes across the three feature dimensions:

$$C_{c,s,t} = \left(C_{color}^{\,r} + C_{shape}^{\,r} + C_{texture}^{\,r}\right)^{1/r} \tag{4}$$

which, given Equation 1, equals:

$$LS_{c,s,t} = \left(LS_{color}^{\,-r} + LS_{shape}^{\,-r} + LS_{texture}^{\,-r}\right)^{-1/r} \tag{5}$$

The optimal value of Minkowski's r was established as follows: We simulated tridimensional search RTs for the Minkowski r-metric model at each r value, sampled from 0.1 to 30 at an interval of 0.1 (as reported in Appendix Table A1). We then compared the predicted tridimensional search RTs at each r value with the observed times. The r value yielding the highest R² (.8606) was 13.3, and it was chosen as the optimal parameter for the Minkowski r-metric model in the first set of experiments (see Appendix Figure A1, Left, for the variation of R² as a function of r value).

Models Retained for Model Comparison

Models Relying on Guidance From One Feature Dimension

The models in this category are based on proposals from previous literature that suggest only a single feature dimension leads the multidimensional search (e.g., Alexander et al., 2019; Williams, 1967).

Model 1: Color-Only Model

This model assumes that the overall contrast signal in a tridimensional search is determined exclusively by the contrast in the color dimension (Equation 6). This model is rooted in a long-held belief that color is such a distinct feature dimension that it can overshadow other feature dimensions when they are presented together. This model is consistent with the findings of Williams (1967), who observed that when the target was characterized by color, shape, and size, it was primarily the color dimension that guided visual fixations:

$$C_{c,s,t} = C_{color} \tag{6}$$

which in turn means:

$$LS_{c,s,t} = LS_{color} \tag{7}$$

Here, LS_c,s,t represents the predicted slope for distractors that differ from the target in color, shape, and texture. LS_color refers to the slope observed in Experiment 1, where distractors differed in color from the target.

Model 2: Best Feature Guidance Model

This model assumes that performance is determined solely by the largest contrast signal among the three relevant dimensions, with the other two signals being ignored (Equation 8). It is equivalent to saying that observers identify the feature dimension that most effectively differentiates items in the scene and concentrate exclusively on it to reject distractors. In other words, the search slope in the tridimensional search task is the same as the search slope of the most efficient unidimensional condition (Equation 9).

This model is conceptually similar to Guided Search 2.0 (Wolfe, 1994), which posits that within a specific feature dimension (e.g., color), a broadly tuned channel (e.g., red) that most effectively distinguishes the target from distractors is selected to accumulate activation in that feature map (e.g., color map). However, while Guided Search assumes that such selection occurs at the feature level, the best feature guidance model posits that it happens at the dimension level. This model also aligns with the finding in Williams (1967) that when the target was defined by both color and size, and the target size was at the largest level, then size could guide attention. This suggests that the feature dimension determining the overall guidance varies based on the utility of the available feature dimensions, rather than being fixed to a specific one:

$$C_{c,s,t} = \max\left(C_{color},\ C_{shape},\ C_{texture}\right) \tag{8}$$

which translates to:

$$LS_{c,s,t} = \min\left(LS_{color},\ LS_{shape},\ LS_{texture}\right) \tag{9}$$

LS_shape and LS_texture represent the slopes where distractors have different shapes (observed in Experiment 2) or textures (observed in Experiment 3) than the target.

Note that Equation 9 is applied to each type of distractor in each of Experiments 4–12. This means that the winning feature dimension is determined independently for each type of distractor, rather than being fixed across different distractor types within or across different experiments.

Models Including Unweighted Color, Shape, and Texture

The models in this category assume that all relevant feature dimensions contribute signals in tridimensional searches, which is consistent with a number of visual search theories (e.g., Bundesen, 1990; Wolfe, 2021; Zelinsky, 2008).

Model 3: Three-Way Orthogonal Combination Model

This model assumes that contrast signals along color, shape, and texture combine orthogonally to form the overall contrast signal (Equation 10). This type of integration was shown previously in Xu, Lleras, and Buetti (2021), where the authors found that the overall contrast between the target and distractors that differ along both shape and texture was determined by the orthogonal sum of the two unidimensional contrast vectors. We hypothesized that color contrast would add to the overall contrast in the same orthogonal fashion:

$$C_{c,s,t} = \sqrt{C_{color}^{2} + C_{shape}^{2} + C_{texture}^{2}} \tag{10}$$

which translates to:

$$LS_{c,s,t} = \left(\frac{1}{LS_{color}^{2}} + \frac{1}{LS_{shape}^{2}} + \frac{1}{LS_{texture}^{2}}\right)^{-1/2} \tag{11}$$

Model 4: Three-Way Collinear Integration Model

This model assumes that contrast signals along color, shape, and texture combine collinearly to form the overall contrast signals (Equation 12). This type of integration was shown previously in Buetti et al. (2019), where the authors found that the overall contrast between the target and distractors that differ along both color and shape was determined by the collinear sum of the two unidimensional contrast vectors. We hypothesized that texture contrast would add to the overall contrast in the same collinear fashion:

$$C_{c,s,t} = C_{color} + C_{shape} + C_{texture} \tag{12}$$

which translates to:

$$LS_{c,s,t} = \left(\frac{1}{LS_{color}} + \frac{1}{LS_{shape}} + \frac{1}{LS_{texture}}\right)^{-1} \tag{13}$$

Model 5: Color Collinear–Shape/Texture Orthogonal Integration Model

This model builds on two findings: that color and shape contrast signals combine collinearly (Buetti et al., 2019) and that texture and shape contrast signals combine orthogonally (Xu, Lleras, & Buetti, 2021). These results led to the hypothesis that, given that texture and shape are integral features, the texture and shape contrast signals would combine orthogonally (i.e., following a Euclidean metric, Garner, 1974), and that color would simply add to this (i.e., city-block metric, Garner, 1974) to form the overall contrast (Equation 14) because color is separable from shape. The name of this model emphasizes that color is collinearly added to the orthogonal combination of shape and texture in the final step:

$$C_{c,s,t} = C_{color} + \sqrt{C_{shape}^{2} + C_{texture}^{2}} \tag{14}$$

which translates to:

$$LS_{c,s,t} = \left(\frac{1}{LS_{color}} + \sqrt{\frac{1}{LS_{shape}^{2}} + \frac{1}{LS_{texture}^{2}}}\right)^{-1} \tag{15}$$

Model 6: Texture Orthogonal–Color/Shape Collinear Combination Model

This model is a variation of Model 5. Specifically, it assumes that color and shape contrast signals first combine collinearly, and then the texture contrast is orthogonally combined with this collinear sum (see Equation 16). The name of this model emphasizes that texture is orthogonally combined with the collinear sum of color and shape contrasts as the final step:

$$C_{c,s,t} = \sqrt{\left(C_{color} + C_{shape}\right)^{2} + C_{texture}^{2}} \tag{16}$$

which solves into:

$$LS_{c,s,t} = \left(\left(\frac{1}{LS_{color}} + \frac{1}{LS_{shape}}\right)^{2} + \frac{1}{LS_{texture}^{2}}\right)^{-1/2} \tag{17}$$

Models Including Weighted Color, Shape, and Texture

In Models 3–6, we assumed that the contrast signals along the three dimensions are equally utilized in forming the overall guidance. However, the extent to which people utilize each dimension might not be uniform (e.g., Bundesen, 1990; Gaspelin & Luck, 2018; Liesefeld et al., 2018; Wolfe, 1994). This idea was computationally explored in Xu, Lleras, Gong, and Buetti (2024), wherein the authors utilized the paradigm introduced by Buetti et al. (2019), using the search slopes in unidimensional color searches and shape searches to predict search performance in bidimensional color and shape searches. The critical manipulation in Xu, Lleras, Gong, and Buetti was to introduce an instruction manipulation in the bidimensional searches: One group of participants was instructed to search for the target color, another group was instructed to search for the target shape, and a final group was instructed to search for the target defined by both color and shape. The results showed that the manipulation of which feature dimension the participants were focused on was captured by corresponding changes in that dimension's weight parameter. These findings suggested that observers might be able to allocate varying degrees of attentional priority to different feature dimensions as a function of the experimental conditions. In the context of the present study, the notion of attentional priority or attentional weight becomes relevant when one dimension provides a larger contrast signal than the others, making it more informative, or when one dimension is naturally preferred by human visual systems (e.g., color; see Alexander et al., 2019; Williams, 1967). In such cases, there might be an imbalance in the attentional weight placed on different feature dimensions, influencing the extent to which people utilize contrast signals along each dimension to guide their attention.

Therefore, for the four models that incorporate signals from all three dimensions, we considered a variation where an attentional weight parameter was added to each dimension.

Model 7: Weighted Three-Way Orthogonal Combination Model

This model introduces an attentional weight parameter to each of the color, shape, and texture components (Equation 18), building upon the original three-way orthogonal combination model (Model 3). The sum of the three weights is constrained to equal 3, ensuring comparability with the original model:

$$C_{c,s,t} = \sqrt{\left(w_{color} C_{color}\right)^{2} + \left(w_{shape} C_{shape}\right)^{2} + \left(w_{texture} C_{texture}\right)^{2}} \tag{18}$$

which solves into:

$$LS_{c,s,t} = \left(\frac{w_{color}^{2}}{LS_{color}^{2}} + \frac{w_{shape}^{2}}{LS_{shape}^{2}} + \frac{w_{texture}^{2}}{LS_{texture}^{2}}\right)^{-1/2} \tag{19}$$

with the constraint that:

$$w_{color} + w_{shape} + w_{texture} = 3 \tag{20}$$

Model 8: Weighted Three-Way Collinear Integration Model

This model also introduces an attentional weight parameter for each dimension (Equation 21), following the framework of the original three-way collinear integration model (Model 4), while maintaining the constraint that the sum of the three weights equals 3:

$$C_{c,s,t} = w_{color} C_{color} + w_{shape} C_{shape} + w_{texture} C_{texture} \tag{21}$$

which solves into:

$$LS_{c,s,t} = \left(\frac{w_{color}}{LS_{color}} + \frac{w_{shape}}{LS_{shape}} + \frac{w_{texture}}{LS_{texture}}\right)^{-1} \tag{22}$$

Model 9: Weighted Color Collinear–Shape/Texture Orthogonal Integration Model

This model introduces weight parameters to both the color component and the combined shape and texture component (Equation 23), based on the original color collinear–shape/texture orthogonal integration model (Model 5). The sum of the two weights should be 2, maintaining consistency with the original model:

$$C_{c,s,t} = w_{color} C_{color} + w_{shape\&texture} \sqrt{C_{shape}^{2} + C_{texture}^{2}} \tag{23}$$

which solves into:

$$LS_{c,s,t} = \left(\frac{w_{color}}{LS_{color}} + w_{shape\&texture} \sqrt{\frac{1}{LS_{shape}^{2}} + \frac{1}{LS_{texture}^{2}}}\right)^{-1} \tag{24}$$

with the constraint that:

$$w_{color} + w_{shape\&texture} = 2 \tag{25}$$

Model 10: Weighted Texture Orthogonal–Color/Shape Collinear Combination Model

This model adds weight parameters to both the combined color and shape component and the texture component (Equation 26), based on the original texture orthogonal–color/shape collinear combination model (Model 6). Similarly, the sum of these two weights should be 2:

$$C_{c,s,t} = \sqrt{\left(w_{color\&shape} \left(C_{color} + C_{shape}\right)\right)^{2} + \left(w_{texture} C_{texture}\right)^{2}} \tag{26}$$

which solves into:

$$LS_{c,s,t} = \left(w_{color\&shape}^{2} \left(\frac{1}{LS_{color}} + \frac{1}{LS_{shape}}\right)^{2} + \frac{w_{texture}^{2}}{LS_{texture}^{2}}\right)^{-1/2} \tag{27}$$

with the constraint that:

$$w_{color\&shape} + w_{texture} = 2 \tag{28}$$

Estimation of the Attentional Weight w

For all the weighted models, the optimal values for the attentional weight w associated with different feature dimensions were established prior to comparing the models' performance and prediction accuracy. Specifically, we simulated the tridimensional search RTs for each of the weighted models using various w values within a specified range (reported in Appendix Table A1). By regressing the observed tridimensional RTs onto the predicted RTs using Equation 3, the w values that yielded the highest R² were selected as the optimal parameters fixed for each model (see Appendix Figure A2). This procedure allows all models to have the same two free parameters, namely, slope and intercept, when evaluating model performance using Equation 3. This put the 10 tested models on an equal footing when comparing their performance based on AIC values. It should be noted that we also performed a split-half predictive analysis to validate the robustness of the w value for the winning model, as described in the Optimal Weight Stability Analysis section of the Results section. The results of that analysis demonstrated that the weights estimated on half of the data robustly predict performance on the other half of the data.

<h31 id="xge-155-3-839-d543e1960">Results</h31>

Search Slopes Observed in Unidimensional Search Experiments

Figure 3 shows the changes in search times as a function of the stimulus set size for each target–distractor pair in Experiments 1–3. Table 2 summarizes the logarithmic slopes observed in these experiments.
[Figure 3]
[Table 2]

Search Slopes Observed in Tridimensional Search Experiments

Table 3 summarizes the logarithmic search slopes observed in Experiments 4–12.
[Table 3]

Model Comparison

Table 4 shows the performance of the 10 models tested in the study, along with the Minkowski r-metric model, which serves as a comparison model. Table 4 includes the model expressions, R², AIC values, and AIC comparison likelihoods for all other models against the winning model.
[Table 4]

Single Dimension Models

Among the models relying on one feature dimension, Model 2 (best feature guidance model; R² = 85.81%, AIC = 842.65) performed the best, being 67 times more likely than Model 1 (color-only model; R² = 84.66%, AIC = 851.05) in explaining the variability in the observed data. Although this relatively simple best feature guidance model outperformed several more complex models, including Model 10 (weighted texture orthogonal–color/shape collinear model; 12 times less likely) and all the unweighted models (Models 3–6, ranging from 12 to 7.8 × 10⁵ times less likely), it was considerably less likely than the three winning weighted models: Model 9 (weighted color collinear–shape/texture orthogonal integration model; R² = 90.76%, AIC = 796.31), Model 7 (weighted three-way orthogonal model; R² = 90.06%, AIC = 804.23), and Model 8 (weighted three-way collinear model; R² = 89.68%, AIC = 808.26), to account for the observed data (1.2 × 10¹⁰, 2.2 × 10⁸, and 2.9 × 10⁷ times less likely, respectively). Therefore, we can conclude that it is unlikely that participants relied solely on a single dimension to guide search; instead, it is more likely that they integrated information across multiple feature dimensions.

Weighted Versus Unweighted Models

Adding weight terms substantially increased the model fit. Specifically, Model 7 (weighted three-way orthogonal model), Model 8 (weighted three-way collinear model), Model 9 (weighted color collinear–shape/texture orthogonal integration model), and Model 10 (weighted texture orthogonal–color/shape collinear combination model) were 8.8 × 10¹⁰, 2.3 × 10¹³, 1.4 × 10¹¹, and 5 times more likely, respectively, than their corresponding unweighted models (Models 3–6) in accounting for the variability in the observed data. We can conclude that, although participants were integrating information across the three feature dimensions, not all information contributed equally to guidance. The results consistently suggested that participants weighed the information coming from the color dimension more heavily than the information coming from the shape and texture dimensions (Model 7, weighted three-way orthogonal model: w_color = 2.3, w_shape = 0.35, and w_texture = 0.35; Model 8, weighted three-way collinear model: w_color = 2.4, w_shape = 0.25, and w_texture = 0.35; Model 9, weighted color collinear–shape/texture orthogonal model: w_color = 1.7 and w_shape&texture = 0.3; and Model 10, weighted texture orthogonal–color/shape collinear model: w_color&shape = 1.3 and w_texture = 0.7). Note that the maximal weight a feature dimension could receive was 3 in the former two models and 2 in the latter two models.

Optimal Minkowski's r

The Minkowski r-metric model acts as a confirmatory model alongside the 10 models tested in the study. Unlike the other models, the Minkowski r-metric model does not define a specific type of combination among the three dimensions. For instance, when r equals 1, the three dimensions combine collinearly, corresponding to Model 4 (three-way collinear integration model). When r equals 2, the three dimensions combine orthogonally, resembling Model 3 (three-way orthogonal combination model). The specific type of combination rule in the Minkowski r-metric model depends on the specific data sets; thus, it is not a predetermined model like the 10 models tested. However, Minkowski's r effectively captures the extent to which the leading dimension drives the overall guidance. Indeed, as r increases toward infinity, the Minkowski r-metric approaches Model 2 (best feature guidance model), where the visual dimension with the overall larger contrasts contributes the most to guidance.

In the current set of experiments, the optimal r that yielded the largest R<sups>2</sups> for the Minkowski r-metric model was 13.3, indicating that the leading dimension received disproportionately greater importance compared to the other feature dimensions. This result aligns with the observed model performance: All of the weighted models outperformed the corresponding unweighted counterparts, indicating an imbalance across the three feature dimensions in terms of prioritization during tridimensional search. Furthermore, across all weighted models, the weights associated with the color dimension were much larger than those for the other dimensions, confirming color’s predominance in this set of experiments.

The Winning Model

In comparing the four weighted models, the R² of Model 9 (weighted color collinear–shape/texture orthogonal integration model) was the highest (which was also the highest among the 10 models tested). Specifically, Model 9 (R² = 90.76%, AIC = 796.31) was found to be 53 times more likely than Model 7 (weighted three-way orthogonal combination model; R² = 90.06%, AIC = 804.23), 393 times more likely than Model 8 (weighted three-way collinear integration model; R² = 89.86%, AIC = 808.25), and 1.2 × 10¹⁰ times more likely than Model 10 (weighted texture orthogonal–color/shape collinear combination model; R² = 85.14%, AIC = 847.62) in explaining the observed data. However, as discussed in the Additional Analyses section, later bootstrapping analyses suggested that Model 7, the weighted three-way orthogonal combination model, might be more universal and the overall winning model.

To visualize the model performance, Figure 4 displays the observed RTs from Experiments 4–12 as a function of the predicted RTs for the four weighted models. These models were constructed based on the most current understanding of how color, shape, and texture combine—that is, color and shape are presumed to combine collinearly, and shape and texture orthogonally, with room for attentional modulation. Note that in each panel, there are 108 RTs being predicted (i.e., four set size levels by 27 tridimensional distractor types).
[Figure 4]

Optimal Weight Stability Analysis

The weighted models (Models 7–10) all contained attentional weight parameters that were estimated on the entire data set. When comparing the weighted models to unweighted models, the implicit assumption is that weight parameters are not free parameters, but rather a characteristic of the data set that is inherent to the condition, and thus should not be counted as an additional parameter in the model. To test the validity of this assumption, we performed a split-half analysis to validate the optimal weights.

For this analysis, we estimated optimal weights of the two top-performing weighted models (Models 7 and 9) using a training data set made up of half of the total data set and assessed their predictive accuracy on the remaining half of the data (the testing set). We began by using the complete data set from the unidimensional experiments (i.e., Experiments 1–3) to estimate unidimensional search slopes. Next, we randomly sampled half of the tridimensional trials to determine optimal weight parameters during tridimensional searches and constructed the two top-performing weighted models (Models 7 and 9). We then tested these models on the remaining half of the tridimensional trials. This process was repeated 100 times to arrive at the parameter estimates and model performance presented in Table 5 and Figure 5. Results showed that both weighted models, constructed based on the training sets, successfully predicted data in the testing sets (R² column in Table 5). These results suggest that the weight parameters are stable estimates of the prioritization people place on different feature dimensions. Additionally, the split-half testing results suggest that Model 7 might be a slightly better model. This was later confirmed in the bootstrapping and post hoc power analysis results; see the Additional Analyses section and Appendix for details.
[Table 5]
[Figure 5]

<h31 id="xge-155-3-839-d543e2150">Discussion</h31>

Results from Experiments 1–12 demonstrated that during tridimensional searches, people likely incorporate information from all the feature dimensions to guide their attention, as the top three performing models all incorporate the contrast signals from all three feature dimensions. Notably, while all three dimensions contribute to search performance, there is a tendency to prioritize signals from the color dimension. This is evidenced by the large optimal weights associated with the color dimension across all weighted models. The optimal Minkowski’s r value of 13.3 further confirms that the visual dimension with the largest distinctiveness signals (in this case, color) was overrepresented compared to the other two dimensions in determining the overall guidance. Indeed, examining the stimuli and search performance in the unidimensional search experiments confirms the leading role of color. The target–distractor distinctiveness along the color dimension (producing the smallest set of search slope values: 25–58 ms/log unit in Experiment 1) was larger compared to the distinctiveness along the shape (44–85 ms/log unit in Experiment 2) and texture (38–101 ms/log unit in Experiment 3) dimensions. Overall, color was the most useful feature dimension for guiding tridimensional searches in these experiments.

The fact that color distinctiveness produced more efficient searches than shape and texture might explain why the best feature guidance model outperformed five out of eight models that consider signals from all three dimensions, including all the unweighted tridimensional models and the weighted texture orthogonal–color/shape collinear model. It could also explain why Model 1 (color-only model) outperformed the unweighted Model 3 (three-way orthogonal model) and Model 4 (three-way collinear model). Given that color in this set of experiments tended to have overall stronger guiding signals than shape or texture, models prioritizing the contribution of color had an advantage in explaining the data, regardless of whether the model captured the underlying structure of how signals combine across different dimensions.

Next, we completed a second set of experiments where the contrast signals along the color, shape, and texture dimensions had comparable ranges. The goal was to evaluate whether people would still preferentially allocate a higher attentional weight to the color dimension or whether the weights reflect the usefulness of a feature dimension, in which case, we would expect similar weight parameters for all three dimensions.

Experimental Set 2

A set of 12 experiments was conducted using the same paradigm as before, each with a naïve group of participants, but using a new set of feature parameters with more comparable distinctiveness signals across the three visual dimensions, such that color no longer had a guidance advantage over shape and texture.<anchor name="b-fn1"></anchor><sups>1</sups> All data, analysis code, and research materials are available on the Open Science Framework (Xu, Lleras, & Buetti, 2024) at <a href="https://osf.io/bmwa4/" target="_blank">https://osf.io/bmwa4/</a>.

<h31 id="xge-155-3-839-d543e2173">Method</h31>

<bold>Participants</bold>

Undergraduate students from the University of Illinois at Urbana–Champaign completed the experiment in exchange for course credit. Since this experimental set was conducted in person, the sample size was determined based on previous in-person experiments in our lab, which showed that averaging the data of 20 subjects produces stable estimates of the group means of reaction times and search slopes for a given search condition (e.g., Buetti et al., 2016; Madison et al., 2018; Ng et al., 2018; Wang et al., 2018) and is sufficient to differentiate between models (e.g., Buetti et al., 2019; Lleras et al., 2019; Wang et al., 2017; Xu, Lleras, & Buetti, 2021; Xu, Lleras, Shao, & Buetti, 2021). Similar to Set 1, a post hoc power analysis (see the Appendix) showed that satisfactory model distinguishability started around a sample size of 20, and weight parameters stabilized around a sample size of 10 for Set 2. We aimed to include 25 valid participants in each in-person experiment, but because of the logistics of data collection (e.g., participants had already signed up for sessions), more participants were sometimes run.

For each experiment, participant inclusion criteria were the same as Set 1 except for the following: (a) Search accuracy rate was now calculated as the percentage of trials in which participants made a correct response divided by the total number of trials (i.e., we did not remove time-out trials before calculating accuracy), and (b) an individual’s average RT should fall within 2.5 standard deviations of the group average RT, instead of 2 standard deviations in Set 1. The number of recruited participants, the number of participants included in the analysis, the number of participants excluded for each criterion, demographic information of included participants, and summary statistics of the measurements in each experiment are reported in Table 6.
[Table 6]

<bold>Apparatus and Stimuli</bold>

Experiments were again programmed in JavaScript and conducted on Pavlovia. Participants completed them in the lab on Mac minis with gamma-corrected 24-in. LCD displays (239.75-Hz refresh rate; 1,920 × 1,080 resolution). Stimuli were randomly displayed on a grid located at the center of the screen, spanning approximately 27 × 27 cm (approximately 25° of visual angle at a viewing distance of 60 cm). The stimuli, measuring roughly 1.4 × 1.4 cm in physical size (1.3° × 1.3° of visual angle), were positioned with a small random jitter on three concentric circular grids, measuring about 25.5 cm, 13.7 cm, and 7.4 cm in diameter, respectively. Examples of stimuli are shown in Figure 6.
[Figure 6]

Unidimensional Experiments

In Experiments 13–15, the target was always a red (L = 54, a = 67, b = 38) octagon with a white cross texture inside. In Experiment 13 (color search), distractors shared the shape (octagon) and texture (cross) with the target but differed in color. They were either 30 (L = 54, a = 45, b = 53), 40 (L = 54, a = 36, b = 56), or 50 (L = 54, a = 27, b = 57) degrees away from the target on an iso-lightness circle in the CIELAB color space (Schurgin et al., 2020; Zhang & Luck, 2008), centered at (L = 54, a = 21.5, b = 11.5) with an average radius of 50.5 (see Figure 6, Panel A). In Experiment 14 (shape search), distractors shared the color (red) and texture (cross) with the target but differed in shape, being either a triangle, house, or square. In Experiment 15 (texture search), distractors shared the same color (red) and shape (octagon) with the target, but their texture varied, featuring either lines forming a tilted pound key, dots, or a solid red texture. Note that in our research, we use observed search slopes in unidimensional searches as a useful operational definition of “perceived similarity” between target and distractor features in that dimension (see Lleras et al., 2025, for a demonstration that perceived similarity in the color dimension perfectly determines search efficiency in that dimension). In other words, matching search performance in terms of search slopes is our operational approach to matching the perceived similarity of target–distractor feature pairs across the three feature dimensions (Table 7).
[Table 7]

Tridimensional Experiments

In Experiments 16–24, the target was the same as in Experiments 13–15 (a red octagon with a white cross texture inside), and the distractors were constructed by combining all the distractor colors, shapes, and textures used in Experiments 13–15. All other aspects were identical to the first set of experiments.

<bold>Design</bold>

The design and procedure were the same as in the first set of experiments, except that in this set, each type of distractor was presented at five set sizes: 2, 4, 9, 19, and 31. We also included a target-only condition in which no distractors were presented. In total, each experiment contained 16 conditions, each repeated 40 times, for a total of 640 trials.
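
As a concreteness check on the trial arithmetic, a minimal trial-list builder (condition labels hypothetical) reproduces the 16 conditions and 640 trials per experiment:

```python
import random

random.seed(0)

def build_trial_list(distractor_types, set_sizes=(2, 4, 9, 19, 31), reps=40):
    """3 distractor types x 5 set sizes + 1 target-only condition = 16
    conditions; 16 x 40 repetitions = 640 trials, randomly ordered."""
    conditions = [("target-only", 0)] + [(d, n) for d in distractor_types
                                         for n in set_sizes]
    trials = conditions * reps
    random.shuffle(trials)
    return trials

trials = build_trial_list(["distractor-A", "distractor-B", "distractor-C"])
assert len(set(trials)) == 16 and len(trials) == 640
```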

<bold>Behavioral–Computational Predictive Approach</bold>

The same predictive approach as used in Experimental Set 1 was adopted for this set of experiments. The optimal value of Minkowski’s r was 1.5 with an R<sups>2</sups> of 0.8564 (see Figure A1, Right for the simulation result and Table A2 in the Appendix). The optimal values for the attentional weight w associated with different feature dimensions in the weighted models were established prior to comparing the models’ performance and prediction accuracy (see Figure A3 and Table A2 in the Appendix).
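
The weight estimation step can be sketched as a straightforward optimization. The RT model below, RT = a + slope × log(set size + 1) with the predicted tridimensional slope taken as the inverse of the weighted orthogonal combination of unidimensional contrasts, is our reconstruction from the target contrast signal framework, not the authors' exact code; the intercept handling is likewise an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def fit_attentional_weights(contrasts, set_sizes, observed_rts, intercept):
    """Find per-dimension weights w that maximize the R^2 between observed
    tridimensional RTs and predictions from the weighted three-way
    orthogonal model. `contrasts` is (n_conditions, 3): the unidimensional
    contrast (inverse slope) on each dimension for each condition."""
    def neg_r2(w):
        combined = np.sqrt(np.sum((w * contrasts) ** 2, axis=1))
        pred = intercept + (1.0 / combined) * np.log(set_sizes + 1.0)
        return -np.corrcoef(pred, observed_rts)[0, 1] ** 2
    res = minimize(neg_r2, x0=np.ones(contrasts.shape[1]),
                   bounds=[(1e-3, None)] * contrasts.shape[1])
    return res.x   # e.g., roughly (1.75, 0.85, 0.4) for Set 2's Model 7
```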

<h31 id="xge-155-3-839-d543e2309">Results</h31>

<bold>Search Slopes Observed in Unidimensional Search Experiments</bold>

Figure 7 shows how the search times changed as a function of the stimulus set size for each target–distractor pair in Experiments 13–15. Table 7 summarizes the logarithmic slopes observed in these experiments.<anchor name="b-fn2"></anchor><sups>2</sups>
[Figure 7]

<bold>Search Slopes Observed in Tridimensional Search Experiments</bold>

Table 8 summarizes the logarithmic search slopes observed in Experiments 16–24.
[Table 8]

<bold>Model Comparison</bold>

Table 9 shows the performance of the 10 models tested for this set of experiments and the Minkowski r-metric model, which serves as a comparison model.
[Table 9]

Single Dimension Models

Model 2 (best feature guidance model) and Model 1 (color-only model) exhibited poorer performance compared to the other eight models that combine contrast signals from all three dimensions. This result aligns with what we found in the first set of experiments, indicating that during tridimensional searches, people likely incorporate signals from more than one dimension to guide their search. This finding contrasts with theories that rely solely on color, as suggested by Williams (1967), or solely on the most informative dimension, to guide search.

Weighted Versus Unweighted Models

The results show that adding the weight terms increases the model fit. Specifically, weighted Models 7–10 were 4.6 × 10<sups>12</sups>, 1.4 × 10<sups>4</sups>, 2.3 × 10<sups>4</sups>, and 1.3 times more likely, respectively, than their corresponding unweighted models in explaining the observed data. Similar to the first data set (Experiments 1–12), participants favored information from the color dimension, as evidenced by a systematic bias in the dimensional weights: Across all the weighted models, the weight assigned to color exceeds one (see Table 9). This trend is particularly evident in the winning weighted three-way orthogonal combination model (Model 7), where the color weight (wcolor = 1.75) is 2 to 4 times larger than the weights for shape (wshape = 0.85) and texture (wtexture = 0.4), respectively.
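
The "times more likely" factors follow from the standard AIC relative-likelihood conversion, exp(ΔAIC/2); a one-line helper makes the conversion explicit (this is standard AIC practice and matches the magnitudes quoted, though the article's exact computation lives in its analysis code):

```python
import math

def times_more_likely(aic_better, aic_worse):
    """Relative likelihood of the better model over the worse one:
    exp((AIC_worse - AIC_better) / 2)."""
    return math.exp((aic_worse - aic_better) / 2.0)

# For example, a factor of ~30,000 corresponds to an AIC gap of
# 2 * ln(30000), i.e., about 20.6 points.
gap = 2.0 * math.log(30_000)
```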

The Winning Model

The winning model was Model 7 (weighted three-way orthogonal model), which was nearly 30,000 times more likely to explain the observed data than the second best performing Model 9 (weighted color collinear–shape/texture orthogonal model), and even more likely relative to the remaining models (see the full AIC comparison results in Table 9).

To visualize model performance, Figure 8 displays the observed RTs from Experiments 16–24 as a function of the predicted RTs for the four weighted models. Note that in each panel, there are 135 RTs being predicted (i.e., 5 set size levels by 27 tridimensional distractor types).
[Figure 8]

<bold>Optimal Weight Stability Analysis</bold>

We performed a split-half analysis to validate the optimal weights in Experimental Set 2 using the same procedure as in Set 1. As shown in Table 10, both weighted models constructed based on the training set successfully predicted performance on the testing set, achieving almost identical R<sups>2</sups> values when modeling the training and testing data. The weight parameters are thus stable estimates of the prioritization people place on different feature dimensions (see Figure 9).
[Table 10]

[Figure 9]

<h31 id="xge-155-3-839-d543e2397">Discussion</h31>

In the second set of experiments, we adjusted the distinctiveness signals along the color dimension so that it would not have a guidance advantage over shape and texture, as it did in Experimental Set 1. This adjustment led to several differences in people’s search performance compared to the first set of experiments.

<bold>Optimal Attentional Weights</bold>

A quick visual comparison between Tables 4 and 9 reveals that, across all weighted models, the color (or color and shape integrated) weights were always substantially larger in the first set of experiments compared to the second set. These results suggest that, when the target–distractor distinctiveness across the three feature dimensions was more comparable, participants placed relatively less priority on the color dimension. However, observers still continued to prioritize color relative to the other dimensions (wcolor > wshape and wtexture; see Table 9).

We also observed that the performance of Model 2 (best feature guidance model) and Model 1 (color-only model) was worse in the second set of experiments compared to the first set. This confirms that the advantages of these two models in the first set may have stemmed from the fact that, more often than not, color produced stronger guiding signals than shape or texture in that stimulus set. Once we better controlled for the strength of guiding signals across the three dimensions, these two models lost their advantages and R<sups>2</sups> dropped from .85–.86 in Set 1 to .56–.76 in Set 2. In the second set, models that incorporate signals from all three dimensions gained an advantage (with R<sups>2</sups> remaining in almost identical ranges of .82–.91 and .82–.90 for the two sets), suggesting that they better captured the underlying structure of the feature combination rule.

<bold>Optimal Minkowski’s r</bold>

The change in priority allocated to the three dimensions is also evident in the optimal values of Minkowski’s r. In the first set of experiments, the optimal value of Minkowski’s r was 13.3, indicating that the leading dimension (mainly color) received a disproportionately large weight in contributing to the overall attentional guidance. In contrast, the optimal Minkowski’s r was 1.5 in the second set of experiments, suggesting that there was only a moderate prioritization of the leading dimension, and participants attended to all visual dimensions in a more even manner. Moreover, since distinctiveness signals were roughly matched across the three dimensions, the leading dimension in each tridimensional condition varied, depending on which dimension provided the largest signal. Further, because color was no longer the primary leading dimension, the higher weights associated with color in this set of experiments did not reflect a guiding advantage of color in this stimulus set; rather, the higher color weights in Set 2 suggest that observers have an innate preference for color.

<bold>Winning Models</bold>

Interestingly, the weighted three-way orthogonal model and weighted color collinear–shape/texture orthogonal model were the top two performing models across both data sets (see Figures 4 and 8), despite differences in the stimuli and experimental conditions (online for the first set vs. in-lab for the second set). Furthermore, the relative changes in weights in these models as well as the change in Minkowski’s r values across the two data sets suggest that when there is a visual dimension that provides a stronger guiding signal than the others, information along that dimension is prioritized, whereas, when the guiding signals are comparable across all dimensions, participants tend to more evenly consider information across all three dimensions. That being said, it is interesting to note that a systematic bias toward color information remained in the second data set.

Additional Analyses: Reliability Analyses of the Winning Models Across Two Sets

Across two data sets, we demonstrated that two models, namely, the weighted color collinear–shape/texture orthogonal model in the first set of experiments and the weighted three-way orthogonal model in the second set of experiments, had strong predictive power and were substantially more likely than any other candidate models. Consequently, the mathematical laws underlying feature combination appear to vary as a function of the ecology of the experimental conditions. This dependence on the experimental conditions limits the generalizability of the results.

To better understand how the winning model varies with experimental conditions, we bootstrapped the data sets to obtain the range of the two top-performing models’ R<sups>2</sups> values. We sampled trials with replacement from each condition 100 times to examine the reliability of the winning model within participants (Figure 10, top). The results showed that, in fact, for the first data set, the weighted three-way orthogonal model achieved a higher R<sups>2</sups> than the weighted color collinear–shape/texture orthogonal model 76 out of 100 times (averaged values and confidence intervals of R<sups>2</sups> are reported in Table 11). For the second data set, the weighted three-way orthogonal model had a higher R<sups>2</sups> 97 out of 100 times. AIC comparison shows that on average, the weighted three-way orthogonal model is 363 times more likely than the weighted color collinear model in Set 1, and 3.3 × 10<sups>7</sups> times in Set 2, to account for the variability in the observed data. These results suggest that the weighted three-way orthogonal model might represent a more universal model, underlying both sets of tridimensional search scenarios.
[Figure 10]

[Table 11]

We also sampled participants with replacement 100 times to examine the reliability of the winning model across participants (Figure 10, bottom). The results showed that the weighted three-way orthogonal model achieved a higher R<sups>2</sups> 90 and 84 out of 100 times in the two sets, respectively (see Table 11 and Figure 10). AIC comparison shows that on average, the weighted three-way orthogonal model is 2.2 × 10<sups>4</sups> times more likely than the weighted color collinear model in Set 1, and 2.4 × 10<sups>8</sups> times in Set 2, to account for the variability in the observed data. This suggests that the advantage of the weighted three-way orthogonal model over the weighted color collinear model is even larger at the participant level than in the aggregated data reported in the Results section.
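
Both bootstrap variants reduce to the same loop with a different resampling unit. The sketch below is schematic: fit_and_score stands in for refitting Models 7 and 9 and returning their R<sups>2</sups> values on a resampled data set.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def bootstrap_winner_count(data, fit_and_score, level="trial", n_reps=100):
    """`data` maps participant -> array of that participant's trials.
    Resample with replacement at the trial level (within each participant)
    or at the participant level, refit both top models, and count how often
    the weighted three-way orthogonal model (Model 7) wins on R^2."""
    wins = 0
    for _ in range(n_reps):
        if level == "trial":
            resampled = {p: t[rng.integers(0, len(t), size=len(t))]
                         for p, t in data.items()}
        else:   # participant level
            picked = rng.choice(list(data.keys()), size=len(data), replace=True)
            resampled = {i: data[p] for i, p in enumerate(picked)}
        r2_model7, r2_model9 = fit_and_score(resampled)
        wins += int(r2_model7 > r2_model9)
    return wins   # out of n_reps
```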

The conclusion that an orthogonal combination model might be more universal is also supported by recent findings from Hughes et al. (2024), who found stronger evidence for an orthogonal combination model when using a Bayesian modeling technique to evaluate feature combination rules at the participant level across two dimensions (color and shape). That said, as Hughes et al. noted, we acknowledge that it is difficult to decisively conclude, from a one-time calculation over participants’ aggregated data, which model is best, given variability within and between participants as well as variability induced by stimulus selection. The two winning models produce very similar R<sups>2</sups> and AIC values during model comparison, and this was particularly true in the first set of experiments. However, this bootstrapping analysis gives us some measure of confidence that the weighted three-way orthogonal model is the most robust and generalizable, given our current data and experimental materials.

General Discussion

Visual search in real life is almost always multidimensional. Looking for a phone among laptops, keyboards, books, and mugs, which differ in color, shape, texture, size, etc., happens much more often than looking for an object that differs from surrounding objects along only one specific feature, like size or color. When visual information is available along multiple dimensions, how does the human visual system utilize these various sorts of information to find a target? Does the visual system utilize them all? Do we prioritize some dimensions and deprioritize others? Is a multidimensional search more efficient than comparable unidimensional searches, or does the additional informational load in multidimensional searches incur a processing cost? The results from the present study provide us with an initial set of answers to these questions. Yes, the visual system can use all the information available, along at least three feature dimensions, to guide attention. Furthermore, the attentional guidance system values information along some visual dimensions, like color, more than others, like shape and texture. Finally, multidimensional search is more efficient than comparable unidimensional searches. These answers represent an initial stepping stone toward understanding attentional guidance in real-life search scenarios.

Previous research had demonstrated that, in bidimensional searches, where the target differed from distractors along two dimensions, distinctiveness signals from each dimension combine in a lawful manner to guide attention (e.g., Buetti et al., 2019; Hughes et al., 2024; Xu, Lleras, & Buetti, 2021). The present study pushed these initial investigations further to understand how distinctiveness signals along three feature dimensions—specifically, color, shape, and surface texture—integrate to produce overall top-down guidance in tridimensional searches. Across two sets of experiments, we also tested how the feature integration rules vary with different stimuli. In the first set, color tended to provide larger guidance signals; in the second set, guidance from the three feature dimensions was more comparable. As a reminder, in tridimensional searches, the target never shared any features with the distractors; therefore, the visual signals from all three feature dimensions differentiated the target from the distractors. Several findings emerged from this study.

First, adding distinctiveness signals from multiple feature dimensions increases search efficiency. Across both sets of experiments, we observed that tridimensional search is more efficient than unidimensional searches using the same features. For instance, the tridimensional search for a red octagon with a cross texture among orange solid squares produced a smaller slope (25.7 ms/log unit, Experiment 4) compared to the unidimensional color search for a red item among orange items (57.8 ms/log unit, Experiment 1), shape search for an octagon among squares (48.2 ms/log unit, Experiment 2), and texture search for a cross among solid textures (37.6 ms/log unit, Experiment 3). This general pattern of results is visualized in Figure 11, where the smallest of the three unidimensional slopes is plotted against the observed corresponding tridimensional slope. One can note that over 94% of the data points (51 out of 54) are above the y = x line, indicating smaller slope values (i.e., faster search) in the tridimensional search conditions. This finding indicates that the processing of color, shape, and texture information is likely happening in parallel and that processing signals along any one of the dimensions neither blocks, postpones, nor slows down the visual system’s processing of the other visual dimensions. Instead, having multiple dimensions simultaneously providing guidance signals facilitates the overall search speed. Indeed, functional magnetic resonance imaging studies have shown that the processing of color, shape, and texture differences happens in separate channels at distinct brain areas, with the processing of shape differences frequently found at the lateral occipital area, the anterior collateral sulcus region being selective for color processing, and the posterior collateral sulcus being selective for processing of surface texture information (e.g., Cant & Goodale, 2007; Cant et al., 2009; Cavina-Pratesi et al., 2010; Mayer & Vuong, 2013). It is therefore likely that the multidimensional guidance signal, which emerges as a combination of the three unidimensional guidance signals, arises later in the visual processing stream than these dimension-specific regions in the visual system. Thus, the current results should inform future investigations on the neural locus of the multidimensional guidance signal, particularly given the finding that receiving more signal from more channels leads to systematic increases in the overall magnitude of the guidance signal and, thus, quickens search.
[Figure 11]

Second, visual signals across the dimensions of color, shape, and texture all contribute to overall guidance during tridimensional searches. The top-performing model in each set of experiments indicated that signals from all three dimensions combined to guide the search, albeit to different degrees. Furthermore, across both data sets, the same two models performed the best: Model 7 (weighted three-way orthogonal combination model; achieving an R<sups>2</sups> above 90% in both data sets) and Model 9 (weighted color collinear–shape/texture orthogonal model; achieving an R<sups>2</sups> above 88% in both data sets). Models considering guidance from only one dimension consistently underperformed in both sets of experiments (i.e., Model 1, the color-only model, and Model 2, the best feature guidance model).

These findings offer insights into the properties of target templates. They suggest that, at least for the three dimensions tested here, the target template representation contains information along all of them and that they are all utilized by the human visual system to guide search. This conclusion seems to stand in contrast with Williams (1967), where the author concluded that participants’ fixations during search were mainly guided by color when the target was defined along two (i.e., color and shape, or color and size) or three dimensions (i.e., color, shape, and size).

There are several possible ways of reconciling Williams’s and the current results. In Williams (1967), stimuli consisted of forms in specific colors, shapes, and sizes, each containing a two-digit number. Participants searched for a target number and were provided with a verbal description containing varying amounts of information regarding the color, shape, and size of the associated form right before each search trial. The search displays were always heterogeneously composed of 100 stimuli, each defined by a unique combination of colors, shapes, and sizes. Because the target information changed randomly throughout the experiment, there was no fixed visual target template when participants performed the search. Previous research has shown that search slows down when the target template is verbally presented rather than visually shown (e.g., Malcolm & Henderson, 2009) and when the target template (especially the feature dimensions defining the target) changes across trials (e.g., Krummenacher et al., 2001; Lleras et al., 2025; Müller et al., 2003). Due to the varying nature of the target in Williams’s study, it may have been difficult for participants to build a stable and useful target template. Consequently, participants might have adopted a minimal effort strategy (Irons & Leber, 2016, 2020), using the easiest feature dimension to narrow down the possible target locations, then serially fixating on each item until finding the target. In contrast, in our task, the target was defined by a fixed color, shape, and texture throughout the task, making it easier for participants to create and maintain a stable target template containing information from all three feature dimensions to guide the search.

Also, the stimuli selection in Williams’s study might have been biased toward the color dimension. In Williams’s study, search was faster when the target was specified by color (mean time = 7.6 s) compared to when it was specified by size (mean time = 16.4 s) or shape (mean time = 20.7 s), and as a baseline, searching based on the target number alone produced a mean time of 22.8 s. These results reflect a situation where color information might have been prioritized over the other dimensions simply because it carried larger contrast signals than the other dimensions, and it likely minimized the contribution of the latter two dimensions to attentional guidance. In the present study, we ran two separate experimental sets manipulating the relative usefulness of the color dimension, and we were able to isolate the color advantage produced by the specific color features from any inherent preference toward the color dimension in the human visual system.

Additionally, our computational methodology is likely more sensitive in detecting the contributions of various features to guidance, even in cases where one visual dimension appears more useful than the others (as observed in the first set of experiments). It is possible that Williams’s (1967) method, which primarily involved observing where the majority of eye fixations occurred, was not sensitive enough to detect contributions from other dimensions.

Another crucial finding from the present study is that, although all three feature dimensions contribute to overall guidance, the extents of their contributions are not always equal but, in fact, dynamically change as a function of the relative usefulness of those dimensions. Because the weighted models generally outperformed nonweighted models, we can conclude that different feature dimensions contributed to guidance in differing amounts. Color received a larger weight compared to shape and texture, particularly in the first set of experiments. In the second set, the three dimensions contributed to the guidance in a more balanced way, although color still had an advantage over the other two. The Minkowski r-metric model also provides confirmatory evidence, with an r of 13.3 in the first set of experiments and a comparatively smaller value of 1.5 in the second set, suggesting a more balanced contribution of the three dimensions in the latter compared to the former. Further, this change in dimensional weights across the two sets corresponds to the change in the magnitude of dimensional contrast signals. In the first set of experiments, there was a greater imbalance in contrast signal magnitude across the three dimensions, with color often providing larger guiding signals than shape and texture; in the second set, the unidimensional color search efficiencies were better matched to those of shape and texture. These results demonstrate the flexibility and adaptability of attentional modulations in feature combination during search: The human visual system dynamically prioritizes feature dimensions according to their relative usefulness. The prioritization each dimension receives seems to depend on the relative strength of signals provided by that dimension compared to other dimensions in a specific situation. The more useful dimensions are more strongly prioritized compared to the less useful dimensions, albeit under the constraints of other factors such as innate preference toward specific dimensions (discussed later in this section), top-down expectations (Grubert et al., 2011; Xu, Lleras, Gong, & Buetti, 2024), and dimension predictability (Witkowski & Geng, 2022; Xu et al., 2025).

There is evidence in both behavioral and neuroscientific studies supporting the idea that people adjust the attentional weight associated with a feature dimension based on its relative usefulness (e.g., Found & Müller, 1996; Grubert et al., 2011; Lee & Geng, 2020; Müller et al., 2003; Xu, Lleras, Gong, & Buetti, 2024; X. Yu & Geng, 2019; J. M. Yu et al., 2025). Behaviorally, Xu, Lleras, Gong, and Buetti (2024) found that a simple instruction manipulation influenced the degree of attention paid to color and shape. Using a singleton search paradigm, Grubert et al. (2011) also found that people searched faster for a bidimensional target than a unidimensional one, but this benefit was stronger when the participants knew they were looking for a bidimensional target compared to when they were expecting a unidimensional target. This indicates that the same stimuli were better utilized when preparing to receive signals from both color and shape dimensions than when preparing to receive a difference signal along only one dimension. Neurophysiologically, Töllner et al. (2008) showed that the N2pc component (a covert attention indicator) appeared earlier in dimension-repetition trials (where the target differed from distractors along the same dimension as the previous trial) compared to dimension-switching trials, which indicates an attentional focus change due to an intertrial effect (also see Gramann et al., 2010; Töllner et al., 2010). Using functional magnetic resonance imaging, Pollmann et al. (2000) found that in an oddball search task, when the target is defined in the same dimension within a block, the associated brain areas are activated to a higher level compared to when the target-defining dimension changed across trials, indicating the possibility of attentional focus enhancement on particular feature dimensions. Finally, this finding of dynamic weight adjustment also aligns nicely with the concept of weight in artificial neural networks, which adjusts dynamically based on new input to maximize performance (e.g., Krizhevsky et al., 2017; LeCun et al., 2015; Rumelhart et al., 1986; Thakur & Peethambaran, 2020).

<h31 id="xge-155-3-839-d543e2672">On the Uniqueness of Color as a Guiding Feature</h31>

Our results showed that there appears to be an inherent preference for attending to color information in tridimensional searches. The higher emphasis placed on color was evident across both sets of experiments: when color generally provided a larger contrast signal than the other two dimensions (Set 1) and when color provided approximately the same degree of featural contrast as shape and texture (Set 2). Since the weight represents how much of the associated unidimensional contrast signal contributes to the tridimensional search guidance, the fact that the color weight (or a combined color and shape weight) value is larger than 1 across all the weighted models indicates an inherent preference or prioritization for the color dimension. This observed preference for color is consistent with earlier findings that people tend to prioritize color signals (e.g., Alexander et al., 2019; Hulleman, 2020; Williams, 1967; Xu, Lleras, Gong, & Buetti, 2024). For instance, Xu, Lleras, Gong, and Buetti (2024) found that regardless of the dimension(s) participants were instructed to focus on (color, shape, or both), the attentional weight associated with the color dimension consistently remained above 1 (and the weight associated with shape was below 1), indicating a persistent prioritization of color regardless of search instructions. These findings on color preference align well with Conway’s (2014) proposal that color is a privileged perceptual dimension that is meant to index interest in the visual world. Color information is also likely more robust to optical and perceptual transformations than shape information. For instance, color information can survive changes in accommodation that typically blur the shape of objects that are beyond the depth of field of the current fixation. Color information can also help recover identity information of low-resolution objects presented in the periphery (e.g., Castelhano & Henderson, 2008; Oliva & Schyns, 2000; Oliva & Torralba, 2001; Rousselet et al., 2005; Torralba, 2009; Wurm et al., 1993).

<h31 id="xge-155-3-839-d543e2714">On Model Performance Indices</h31>

In both sets of experiments, to indicate model performance, we prioritized the index of R<sups>2</sups>, which represents the extent to which a model successfully captures trends in the observed data that are directly tied to experimental manipulations (e.g., that RT increases with set size and with target–distractor similarity), and the AIC metric, which allows us to evaluate the relative likelihood of any given model over another. When regressing the observed RTs onto the predicted ones (Equation 3), R<sups>2</sups> captures the precision of a model’s predictions, whereas the slope and intercept of the function indicate how close the average prediction is to the observed values. Therefore, even a model that captures almost the entirety of the observed variability in the data (R<sups>2</sups> close to 1) might systematically over- or underpredict the observed values. That is, one can observe models with high R<sups>2</sups> values but slopes distant from 1 and intercepts distant from 0.
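
A short helper makes these three indices concrete (cf. Equation 3; a sketch, with NumPy's least-squares line fit standing in for the article's regression):

```python
import numpy as np

def prediction_diagnostics(predicted, observed):
    """Regress observed RTs onto predicted RTs. R^2 indexes the precision
    of the predictions; a slope near 1 and an intercept near 0 index their
    accuracy (how close predictions are, on average, to the observations)."""
    slope, intercept = np.polyfit(predicted, observed, deg=1)
    r2 = np.corrcoef(predicted, observed)[0, 1] ** 2
    return slope, intercept, r2
```

On this reading, a model can track the data almost perfectly (R<sups>2</sups> near 1) while still requiring a multiplicative and an additive correction, which is exactly the Set 2 pattern described next.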

Here, we observed a prediction slope of 1.03 and an intercept of 4.22 in Set 1, which indicates a nearly perfect prediction. However, in Set 2, the prediction slope of the winning model was 0.73 and the intercept was 171.74. These values deviate systematically from a perfect prediction, yet the high R<sups>2</sups> value indicates that the model accurately tracks variations in the observed data.

We should note that we are not overly concerned by the fitted intercept values larger than zero (as was the case in Set 2). Such a constant offset in prediction times is likely to reflect non-search-related processes, such as differences in perceptual encoding, response selection, and response execution. More likely in our experiments, this offset might be capturing differences in overall RTs between groups, since unidimensional and tridimensional searches were conducted on different groups of participants.

A fitted slope distant from 1 can be seen as a “correction” to the predicted search slope LS, reflecting the extent to which the slope measured in unidimensional searches underpredicts the speed of contrast accumulation in multidimensional searches. Such a misalignment indicates that processing efficiency is higher in the multidimensional search than in the unidimensional search. In fact, this is what we proposed in Xu, Lleras, and Buetti (2021): When shape and texture combine, there is an overall prediction slope of 0.75, meaning that participants’ search times were faster than predicted by a factor of 3/4, which might arise from coactivation when combining signals from different dimensions. In Set 2, the weighted three-way orthogonal model achieves the highest R<sups>2</sups> of 0.9, but its prediction needs to be “corrected” by a prediction slope of 0.73. This result indicates that there might be a 37% increase in the efficiency (1.37 is the inverse of 0.73) with which evidence is being gathered across multiple dimensions compared to single dimensions, without necessarily impacting the architecture in which that information is combined across dimensions. It is worth noting that, in other contexts using an analogous prediction methodology, winning models using unidimensional slopes have been found to instead overpredict the speed of information processing, as is the case when trying to predict processing speeds in heterogeneous search displays based on homogeneous search times (see Cui et al., 2025; Lleras et al., 2019; Wang et al., 2017; Xu, Lleras, Shao, & Buetti, 2021). This has been taken as evidence that processing efficiency decreases when distractor heterogeneity increases.

<h31 id="xge-155-3-839-d543e2768">Potential Limitations and Future Directions</h31>

A limitation of the current methodology comes from the fact that incorporating signals from three dimensions almost invariably results in very rapid search, producing shallow tridimensional search slopes and thereby leaving limited room to observe differences in modeling. That being said, our best performing models did account for over 90% of the available variance in the data. Next steps might involve a continued exploration of feature space with the goal of selecting stimuli that yield smaller unidimensional distinctiveness (i.e., steeper unidimensional slopes), ensuring a larger range of overall distinctiveness when combining the three dimensions.

Furthermore, in the present study, attentional weights were estimated from the observed tridimensional search data, rather than determined a priori, which limits the predictive power of the models. However, the split-half analysis in both Sets 1 and 2 suggests that the weight parameters are stable estimates of the prioritization people place on different feature dimensions. Moreover, changes in these weight values across different experimental sets (see Figure 12) reflect modifications in the properties of the search stimuli rather than mere data noise. These results open possibilities for making a priori predictions of the attentional weights. Moving forward, we aim to develop methods for quantitatively predicting the values of attentional weights based on contextual factors, such as the relative usefulness of different feature dimensions, the innate preference for certain dimensions, or the top-down emphasis placed on specific dimensions (as demonstrated in Xu, Lleras, Gong, & Buetti, 2024). By making a priori predictions about how the human visual system dynamically modulates attentional weights as a function of these contextual factors, we can enhance our ability to forecast human behavior in more complex and realistic search scenarios.
[Figure 12]

Additionally, modeling in the present study was performed on aggregated data (i.e., individual averaged RTs were computed across trials, and then group averaged RTs were computed across participants). This approach ensures stable search slope and RT estimates, but it comes at the cost of not being able to predict variability between participants. That is, we were unable to model how any given participant’s unidimensional slopes combined to predict tridimensional slopes at the individual level. (However, we did include both participant-level and trial-level bootstrapping as an additional analysis to examine the variation in the winning models’ performance to mimic between- and within-participant variability.) In other words, the current modeling results are more representative of an average participant, rather than of any one participant. Future studies could use a multilevel modeling approach (e.g., Hughes et al., 2024) to better separate the contribution of the manipulated task factors (e.g., target–distractor distinctiveness, set size) from participant- and trial-level variability, which allows for a better modeling of individual differences and intertrial variation in search behavior.

Finally, in this study, target–distractor distinctiveness was indexed by the participants’ logarithmic search slope, rather than measured on a calibrated perceptual space. While we did use CIELAB space to select the colors in Set 2 (colors were taken from an iso-lightness color circle in the CIELAB space), no similar feature space was used to aid stimuli selection of shape or texture, since there is no calibrated perceptual space for measuring similarity along shape or texture. This is why, in the present study, we chose to use search efficiency as an operational definition of the perceived featural difference between two stimuli. This is not too far-fetched because search efficiency is (theoretically and empirically) related to perceptual similarity (see Duncan & Humphreys, 1989, 1992, for the theoretical link). Indeed, in a recent study from our laboratory (Lleras et al., 2025), we studied the direct relationship between search efficiency and perceptual similarity in color space, using the CIELAB space, which is a calibrated perceptual space for color. A similar relationship between search performance and color similarity was also found in Chapman and Störmer (2024), where the authors demonstrated that the search slope is directly related to the inverse of the color distance in the CIELAB space, and in Chapman and Störmer (2022), where they observed a relationship between the search RT and color distance, although these authors found these relationships only at the higher end of the similarity space. It will be important to continue to study and understand the relationship between search slopes and perceptual similarity across different feature dimensions.

<h31 id="xge-155-3-839-d543e2803">Constraints on Generality</h31>

This study was conducted with undergraduate students at the University of Illinois, Urbana-Champaign, as well as participants recruited from Prolific, an online data collection platform frequently used in psychological research. Participants were aged 18–30 years, required to have normal visual acuity and color vision, and included 201 males, 526 females, 13 nonbinary individuals, and one participant who chose not to respond. The reported findings should generalize well to the general population within a similar age range and across genders, with normal visual acuity and color vision.

Conclusion

The present study demonstrated that visual distinctiveness signals from color, shape, and texture all contributed to predicting search performance in tridimensional search. The modeling suggested that the distinctiveness signals across these dimensions combine in a weighted three-way orthogonal manner to determine the overall distinctiveness that guides tridimensional search. We also quantitatively estimated the attentional weight parameter for each feature dimension, which captured the extent to which people prioritize the signal from that specific dimension. Finally, by manipulating the usefulness of the color dimension relative to shape and texture, we showed that people have an inherent preference for using color information to guide search. In addition to that preference, the relative usefulness of each feature dimension also influences the extent to which any one dimension is prioritized in a given search scenario.

This study provides not only scientific evidence regarding how vision manages complex, multidimensional signals but also a framework for modeling complex task performance using simpler ones. Our findings also contribute to applied fields such as product design. Understanding how humans process visual information can inform the creation of more visually intuitive and user-friendly products, such as by incorporating the most efficiently combined feature dimensions in visual elements and prioritizing important information using the inherently preferred dimension of color. The conclusions drawn from this study also have implications for the development of more biologically accurate neural networks to better understand and predict human behaviors on a larger scale.

Footnotes

<anchor name="fn1"></anchor>

<sups> 1 </sups> A pilot color search experiment with eight participants was run to select target and distractor colors that produced slopes in a similar range as the shape and texture features, which stayed the same across Sets 1 and 2.

<anchor name="fn2"></anchor>

<sups> 2 </sups> Note that although Sets 1 and 2 used the same shape and texture features, there were some critical differences between the two sets of experiments that complicate direct comparison of slope values across the two sets. First, the search grid was larger in Set 2 (27 × 27 cm) compared to Set 1 (15 × 15 cm), resulting in greater average target eccentricity in Set 2. Eccentricity is known to reduce search efficiency (e.g., Carrasco et al., 1995; Carrasco & Frieder, 1997; Wang et al., 2018). This effect is especially pronounced in more difficult conditions (i.e., those eliciting steeper slopes), where low target–distractor discriminability makes it particularly challenging to identify targets in the far periphery. In easier conditions (e.g., distractors such as squares, triangles, and solid textures), the large discriminability signals may reduce the impact of larger eccentricity on performance. Second, the stimuli in the shape-only and texture-only conditions were gray in Set 1, but red in Set 2. It is unclear how this signal may have impacted search efficiency, but it is worth pointing out that the unidimensional perceptual comparisons were different across the two sets.

References

<anchor name="c1"></anchor>

Adeli, H., Vitu, F., & Zelinsky, G. J. (2017). A model of the superior colliculus predicts fixation locations during scene viewing and visual search. The Journal of Neuroscience, 37(6), 1453–1467. 10.1523/JNEUROSCI.0825-16.2016

<anchor name="c2"></anchor>

Alexander, R. G., Nahvi, R. J., & Zelinsky, G. J. (2019). Specifying the precision of guiding features for visual search. Journal of Experimental Psychology: Human Perception and Performance, 45(9), 1248–1264. 10.1037/xhp0000668

<anchor name="c3"></anchor>

Appelbaum, M., Cooper, H., Kline, R. B., Mayo-Wilson, E., Nezu, A. M., & Rao, S. M. (2018). Journal article reporting standards for quantitative research in psychology: The APA Publications and Communications Board Task Force Report. American Psychologist, 73(1), 3–25. 10.1037/amp0000191

<anchor name="c4"></anchor>

Buetti, S., Cronin, D. A., Madison, A. M., Wang, Z., & Lleras, A. (2016). Towards a better understanding of parallel visual processing in human vision: Evidence for exhaustive analysis of visual information. Journal of Experimental Psychology: General, 145(6), 672–707. 10.1037/xge0000163

<anchor name="c5"></anchor>

Buetti, S., Xu, J., & Lleras, A. (2019). Predicting how color and shape combine in the human visual system to direct attention. Scientific Reports, 9(1), Article 20258. 10.1038/s41598-019-56238-9

<anchor name="c6"></anchor>

Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97(4), 523–547. 10.1037/0033-295X.97.4.523

<anchor name="c7"></anchor>

Cant, J. S., Arnott, S. R., & Goodale, M. A. (2009). fMR-adaptation reveals separate processing regions for the perception of form and texture in the human ventral stream. Experimental Brain Research, 192(3), 391–405. 10.1007/s00221-008-1573-8

<anchor name="c8"></anchor>

Cant, J. S., & Goodale, M. A. (2007). Attention to form or surface properties modulates different regions of human occipitotemporal cortex. Cerebral Cortex, 17(3), 713–731. 10.1093/cercor/bhk022

<anchor name="c93"></anchor>

Carrasco, M., Evert, D. L., Chang, I., & Katz, S. M. (1995). The eccentricity effect: Target eccentricity affects performance on conjunction searches. Perception & Psychophysics, 57(8), 1241–1261. 10.3758/BF03208380

<anchor name="c94"></anchor>

Carrasco, M., & Frieder, K. S. (1997). Cortical magnification neutralizes the eccentricity effect in visual search. Vision Research, 37(1), 63–82. 10.1016/S0042-6989(96)00102-2

<anchor name="c9"></anchor>

Castelhano, M. S., & Henderson, J. M. (2008). The influence of color on the perception of scene gist. Journal of Experimental Psychology: Human Perception and Performance, 34(3), 660–675. 10.1037/0096-1523.34.3.660

<anchor name="c10"></anchor>

Cavina-Pratesi, C., Kentridge, R. W., Heywood, C. A., & Milner, A. D. (2010). Separate channels for processing form, texture, and color: Evidence from FMRI adaptation and visual object agnosia. Cerebral Cortex, 20(10), 2319–2332. 10.1093/cercor/bhp298

<anchor name="c11"></anchor>

Chapman, A. F., & Störmer, V. S. (2022). Feature similarity is non-linearly related to attentional selection: Evidence from visual search and sustained attention tasks. Journal of Vision, 22(8), Article 4. 10.1167/jov.22.8.4

<anchor name="c12"></anchor>

Chapman, A. F., & Störmer, V. S. (2024). Target-distractor similarity predicts visual search efficiency but only for highly similar features. Attention, Perception, & Psychophysics, 86(6), 1872–1882. 10.3758/s13414-024-02954-y

<anchor name="c13"></anchor>

Chun, M. M., & Wolfe, J. M. (1996). Just say no: How are visual searches terminated when there is no target present? Cognitive Psychology, 30(1), 39–78. 10.1006/cogp.1996.0002

<anchor name="c14"></anchor>

Conway, B. R. (2014). Color signals through dorsal and ventral visual pathways. Visual Neuroscience, 31(2), 197–209. 10.1017/S0952523813000382

<anchor name="c15"></anchor>

Cui, A. Y., Buetti, S., Xu, Z. J., & Lleras, A. (2025). Evaluating the contribution of parallel processing of color and shape in a conjunction search task. Scientific Reports, 15(1), Article 7760. 10.1038/s41598-025-92453-3

<anchor name="c16"></anchor>

Duncan, J., & Humphreys, G. (1992). Beyond the search surface: Visual search and attentional engagement. Journal of Experimental Psychology: Human Perception and Performance, 18(2), 578–588. 10.1037/0096-1523.18.2.578

<anchor name="c17"></anchor>

Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96(3), 433–458. 10.1037/0033-295X.96.3.433

<anchor name="c18"></anchor>

Eckstein, M. P., Thomas, J. P., Palmer, J., & Shimozaki, S. S. (2000). A signal detection model predicts the effects of set size on visual search accuracy for feature, conjunction, triple conjunction, and disjunction displays. Perception & Psychophysics, 62(3), 425–451. 10.3758/BF03212096

<anchor name="c19"></anchor>

Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 18(4), 1030–1044. 10.1037/0096-1523.18.4.1030

<anchor name="c20"></anchor>

Found, A., & Müller, H. J. (1996). Searching for unknown feature targets on more than one dimension: Investigating a “dimension-weighting” account. Perception & Psychophysics, 58(1), 88–101. 10.3758/BF03205479

<anchor name="c21"></anchor>

Garner, W. R. (1974). The processing of information and structure. Lawrence Erlbaum Associates. 10.4324/9781315802862

<anchor name="c22"></anchor>

Garner, W. R., & Felfoldy, G. L. (1970). Integrality of stimulus dimensions in various types of information processing. Cognitive Psychology, 1(3), 225–241. 10.1016/0010-0285(70)90016-2

<anchor name="c23"></anchor>

Gaspelin, N., & Luck, S. J. (2018). Distinguishing among potential mechanisms of singleton suppression. Journal of Experimental Psychology: Human Perception and Performance, 44(4), 626–644. 10.1037/xhp0000484

<anchor name="c24"></anchor>

Gramann, K., Töllner, T., & Müller, H. J. (2010). Dimension-based attention modulates early visual processing. Psychophysiology, 47(5), 968–978. 10.1111/j.1469-8986.2010.00998.x

<anchor name="c25"></anchor>

Grubert, A., Krummenacher, J., & Eimer, M. (2011). Redundancy gains in pop-out visual search are determined by top-down task set: Behavioral and electrophysiological evidence. Journal of Vision, 11(14), Article 10. 10.1167/11.14.10

<anchor name="c26"></anchor>

Hamblin-Frohman, Z., & Becker, S. I. (2021). The attentional template in high and low similarity search: Optimal tuning or tuning to relations? Cognition, 212, Article 104732. 10.1016/j.cognition.2021.104732

<anchor name="c27"></anchor>

Henderson, J. M., Malcolm, G. L., & Schandl, C. (2009). Searching in the dark: Cognitive relevance drives attention in real-world scenes. Psychonomic Bulletin & Review, 16(5), 850–856. 10.3758/PBR.16.5.850

<anchor name="c28"></anchor>

Hoffman, J. E. (1979). A two-stage model of visual search. Perception & Psychophysics, 25(4), 319–327. 10.3758/BF03198811

<anchor name="c29"></anchor>

Hughes, A. E., Nowakowska, A., & Clarke, A. D. (2024). Bayesian multi-level modelling for predicting single and double feature visual search. Cortex, 171, 178–193. 10.1016/j.cortex.2023.10.014

<anchor name="c30"></anchor>

Hulleman, J. (2020). Quantitative and qualitative differences in the top-down guiding attributes of visual search. Journal of Experimental Psychology: Human Perception and Performance, 46(9), 942–964. 10.1037/xhp0000764

<anchor name="c31"></anchor>

Hulleman, J., & Olivers, C. N. L. (2017). The impending demise of the item in visual search. Behavioral and Brain Sciences, 40, Article e132. 10.1017/S0140525X15002794

<anchor name="c32"></anchor>

Irons, J. L., & Leber, A. B. (2016). Choosing attentional control settings in a dynamically changing environment. Attention, Perception, & Psychophysics, 78(7), 2031–2048. 10.3758/s13414-016-1125-4

<anchor name="c33"></anchor>

Irons, J. L., & Leber, A. B. (2020). Developing an individual profile of attentional control strategy. Current Directions in Psychological Science, 29(4), 364–371. 10.1177/0963721420924018

<anchor name="c34"></anchor>

Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40(10–12), 1489–1506. 10.1016/S0042-6989(99)00163-7

<anchor name="c35"></anchor>

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. 10.1145/3065386

<anchor name="c36"></anchor>

Krummenacher, J., Müller, H. J., & Heller, D. (2001). Visual search for dimensionally redundant pop-out targets: Evidence for parallel-coactive processing of dimensions. Perception & Psychophysics, 63(5), 901–917. 10.3758/BF03194446

<anchor name="c37"></anchor>

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. 10.1038/nature14539

<anchor name="c38"></anchor>

Lee, J., & Geng, J. J. (2020). Flexible weighting of target features based on distractor context. Attention, Perception, & Psychophysics, 82(2), 739–751. 10.3758/s13414-019-01910-5

<anchor name="c39"></anchor>

Liesefeld, H. R., Liesefeld, A. M., Pollmann, S., & Müller, H. J. (2018). Biasing allocations of attention via selective weighting of saliency signals: Behavioral and neuroimaging evidence for the dimension-weighting account. In T.Hodgson (Eds.), Processes of visuospatial attention and working memory: Current topics in behavioral neurosciences (Vol. 41, pp. 87–113). Springer. 10.1007/7854_2018_75

<anchor name="c40"></anchor>

Lleras, A., Buetti, S., & Xu, Z. J. (2022). Incorporating the properties of peripheral vision into theories of visual search. Nature Reviews Psychology, 1(10), 590–604. 10.1038/s44159-022-00097-1

<anchor name="c41"></anchor>

Lleras, A., Wang, Z., Madison, A., & Buetti, S. (2019). Predicting search performance in heterogeneous scenes: Quantifying the impact of homogeneity effects in efficient search. Collabra: Psychology, 5(1), Article 2. 10.1525/collabra.151

<anchor name="c42"></anchor>

Lleras, A., Wang, Z., Ng, G. J. P., Ballew, K., Xu, J., & Buetti, S. (2020). A target contrast signal theory of parallel processing in goal-directed search. Attention, Perception & Psychophysics, 82(2), 394–425. 10.3758/s13414-019-01928-9

<anchor name="c43"></anchor>

Lleras, A., Xu, Z. J., Tan, H. J. H., Shao, Y., & Buetti, S. (2025). Quantifying the relationship between search efficiency and perceptual similarity in color space across different efficient search tasks. Journal of Experimental Psychology: Human Perception and Performance, 51(7), 850–874. 10.1037/xhp0001327

<anchor name="c44"></anchor>

Madison, A., Lleras, A., & Buetti, S. (2018). The role of crowding in parallel search: Peripheral pooling is not responsible for logarithmic efficiency in parallel search. Attention, Perception, & Psychophysics, 80(2), 352–373. 10.3758/s13414-017-1441-3

<anchor name="c45"></anchor>

Malcolm, G. L., & Henderson, J. M. (2009). The effects of target template specificity on visual search in real-world scenes: Evidence from eye movements. Journal of Vision, 9(11), Article 8. 10.1167/9.11.8

<anchor name="c46"></anchor>

Mayer, K. M., & Vuong, Q. C. (2013). Automatic processing of unattended object features by functional connectivity. Frontiers in Human Neuroscience, 7, Article 193. 10.3389/fnhum.2013.00193

<anchor name="c47"></anchor>

Müller, H. J., Reimann, B., & Krummenacher, J. (2003). Visual search for singleton feature targets across dimensions: Stimulus- and expectancy-driven effects in dimensional weighting. Journal of Experimental Psychology: Human Perception and Performance, 29(5), 1021–1035. 10.1037/0096-1523.29.5.1021

<anchor name="c48"></anchor>

Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434(7031), 387–391. 10.1038/nature03390

<anchor name="c49"></anchor>

Navalpakkam, V., & Itti, L. (2007). Search goal tunes visual features optimally. Neuron, 53(4), 605–617. 10.1016/j.neuron.2007.01.018

<anchor name="c50"></anchor>

Ng, G. J. P., Lleras, A., & Buetti, S. (2018). Fixed-target efficient search has logarithmic efficiency with and without eye movements. Attention, Perception, & Psychophysics, 80(7), 1752–1762. 10.3758/s13414-018-1561-4

<anchor name="c51"></anchor>

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115(1), 39–57. 10.1037/0096-3445.115.1.39

<anchor name="c52"></anchor>

Oliva, A., & Schyns, P. G. (2000). Diagnostic colors mediate scene recognition. Cognitive Psychology, 41(2), 176–210. 10.1006/cogp.1999.0728

<anchor name="c53"></anchor>

Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3), 145–175. 10.1023/A:1011139631724

<anchor name="c54"></anchor>

Pollmann, S., Weidner, R., Müller, H. J., & von Cramon, D. Y. (2000). A fronto-posterior network involved in visual dimension changes. Journal of Cognitive Neuroscience, 12(3), 480–494. 10.1162/089892900562156

<anchor name="c55"></anchor>

Pramod, R. T., & Arun, S. P. (2014). Features in visual search combine linearly. Journal of Vision, 14(4), Article 6. 10.1167/14.4.6

<anchor name="c56"></anchor>

Pramod, R. T., & Arun, S. P. (2016). Object attributes combine additively in visual search. Journal of Vision, 16(5), Article 8. 10.1167/16.5.8

<anchor name="c57"></anchor>

Rosenholtz, R. (2016). Capabilities and limitations of peripheral vision. Annual Review of Vision Science, 2(1), 437–457. 10.1146/annurev-vision-082114-035733

<anchor name="c58"></anchor>

Rosenholtz, R., Huang, J., Raj, A., Balas, B. J., & Ilie, L. (2012). A summary statistic representation in peripheral vision explains visual search. Journal of Vision, 12(4), Article 14. 10.1167/12.4.14

<anchor name="c59"></anchor>

Rousselet, G., Joubert, O., & Fabre-Thorpe, M. (2005). How long to get to the “gist” of real-world natural scenes?Visual Cognition, 12(6), 852–877. 10.1080/13506280444000553

<anchor name="c60"></anchor>

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. 10.1038/323533a0

<anchor name="c61"></anchor>

Sawaki, R., & Luck, S. J. (2010). Capture versus suppression of attention by salient singletons: Electrophysiological evidence for an automatic attend-to-me signal. Attention, Perception, & Psychophysics, 72(6), 1455–1470. 10.3758/APP.72.6.1455

<anchor name="c62"></anchor>

Schurgin, M. W., Wixted, J. T., & Brady, T. F. (2020). Psychophysical scaling reveals a unified theory of visual memory strength. Nature Human Behaviour, 4(11), 1156–1172. 10.1038/s41562-020-00938-0

<anchor name="c63"></anchor>

Thakur, S., & Peethambaran, J. (2020). Dynamic edge weights in graph neural networks for 3D object detection. arXiv.

<anchor name="c64"></anchor>

Theeuwes, J. (1991). Cross-dimensional perceptual selectivity. Perception & Psychophysics, 50(2), 184–193. 10.3758/BF03212219

<anchor name="c65"></anchor>

Theeuwes, J. (1992). Perceptual selectivity for color and form. Perception & Psychophysics, 51(6), 599–606. 10.3758/BF03211656

<anchor name="c66"></anchor>

Töllner, T., Gramann, K., Müller, H. J., Kiss, M., & Eimer, M. (2008). Electrophysiological markers of visual dimension changes and response changes. Journal of Experimental Psychology: Human Perception and Performance, 34(3), 531–542. 10.1037/0096-1523.34.3.531

<anchor name="c67"></anchor>

Töllner, T., Zehetleitner, M., Gramann, K., & Müller, H. J. (2010). Top-down weighting of visual dimensions: Behavioral and electrophysiological evidence. Vision Research, 50(14), 1372–1381. 10.1016/j.visres.2009.11.009

<anchor name="c68"></anchor>

Torralba, A. (2009). How many pixels make an image?Visual Neuroscience, 26(1), 123–131. 10.1017/S0952523808080930

<anchor name="c69"></anchor>

Townsend, J. T., & Gregory Ashby, F. (1983). The stochastic modeling of elementary psychological processes. Cambridge University Press.

<anchor name="c70"></anchor>

Treisman, A., & Sato, S. (1990). Conjunction search revisited. Journal of Experimental Psychology: Human Perception and Performance, 16(3), 459–478. 10.1037/0096-1523.16.3.459

<anchor name="c71"></anchor>

Ullman, S. (1987). Visual routines. In M. A.Fischler & O.Firschein (Eds.), Readings in computer vision (pp. 298–328). Morgan Kaufmann. 10.1016/B978-0-08-051581-6.50035-0

<anchor name="c72"></anchor>

Vickery, T. J., King, L.-W., & Jiang, Y. (2005). Setting up the target template in visual search. Journal of Vision, 5(1), Article 8. 10.1167/5.1.8

<anchor name="c73"></anchor>

Wang, Z., Buetti, S., & Lleras, A. (2017). Predicting search performance in heterogeneous visual search scenes with real-world objects. Collabra: Psychology, 3(1), Article 6. 10.1525/collabra.53

<anchor name="c74"></anchor>

Wang, Z., Lleras, A., & Buetti, S. (2018). Parallel, exhaustive processing underlies logarithmic search functions: Visual search with cortical magnification. Psychonomic Bulletin & Review, 25(4), 1343–1350. 10.3758/s13423-018-1466-1

<anchor name="c75"></anchor>

Williams, L. G. (1967). The effects of target specification on objects fixated during visual search. Acta Psychologica, 27, 355–360. 10.1016/0001-6918(67)90080-7

<anchor name="c76"></anchor>

Witkowski, P. P., & Geng, J. J. (2022). Attentional priority is determined by predicted feature distributions. Journal of Experimental Psychology: Human Perception and Performance, 48(11), 1201–1212. 10.1037/xhp0001041

<anchor name="c77"></anchor>

Wolfe, J. M. (1994). Guided search 2.0 a revised model of visual search. Psychonomic Bulletin & Review, 1(2), 202–238. 10.3758/BF03200774

<anchor name="c78"></anchor>

Wolfe, J. M. (2021). Guided search 6.0: An updated model of visual search. Psychonomic Bulletin & Review, 28(4), 1060–1092. 10.3758/s13423-020-01859-9

<anchor name="c79"></anchor>

Wolfe, J. M., Horowitz, T. S., Kenner, N., Hyle, M., & Vasan, N. (2004). How fast can you change your mind? The speed of top-down guidance in visual search. Vision Research, 44(12), 1411–1426. 10.1016/j.visres.2003.11.024

<anchor name="c80"></anchor>

Wurm, L. H., Legge, G. E., Isenberg, L. M., & Luebker, A. (1993). Color improves object recognition in normal and low vision. Journal of Experimental Psychology: Human Perception and Performance, 19(4), 899–911. 10.1037/0096-1523.19.4.899

<anchor name="c81"></anchor>

Xu, Z. J., Buetti, S., & Lleras, A. (2020, May11). Predicting how color, texture and shape combine in the human visual system to direct attention. 10.17605/OSF.IO/P5TXF

<anchor name="c82"></anchor>

Xu, Z. J., Lleras, A., & Buetti, S. (2021). Predicting how surface texture and shape combine in the human visual system to direct attention. Scientific Reports, 11(1), Article 6170. 10.1038/s41598-021-85605-8

<anchor name="c83"></anchor>

Xu, Z. J., Lleras, A., & Buetti, S. (2024, June2). Predicting how color, texture and shape combine in the human visual system to direct attention. <a href="https://osf.io/bmwa4" target="_blank">https://osf.io/bmwa4</a>

<anchor name="c84"></anchor>

Xu, Z. J., Lleras, A., Gong, Z. G., & Buetti, S. (2024). Top-down instructions influence the attentional weight on color and shape dimensions during bidimensional search. Scientific Reports, 14(1), Article 31376. 10.1038/s41598-024-82866-x

<anchor name="c85"></anchor>

Xu, Z. J., Lleras, A., Shao, Y., & Buetti, S. (2021). Distractor-distractor interactions in visual search for oriented targets explain the increased difficulty observed in nonlinearly separable conditions. Journal of Experimental Psychology: Human Perception and Performance, 47(9), 1274–1297. 10.1037/xhp0000941

<anchor name="c86"></anchor>

Xu, Z. J., Yu, J., Lleras, A., & Buetti, S. (2025). Investigating the contribution of unpredictable target features to attentional guidance. Journal of Vision, 25(9), 2579. 10.1167/jov.25.9.2579

<anchor name="c87"></anchor>

Yu, J. M., Xu, Z. J., Lleras, A., & Simona, B. (2025). Exploring the impact of target-distractor featural contrast on feature prioritization in efficient visual search. Journal of Vision, 25(9), Article 2498. 10.1167/jov.25.9.2498

<anchor name="c88"></anchor>

Yu, X., & Geng, J. J. (2019). The attentional template is shifted and asymmetrically sharpened by distractor context. Journal of Experimental Psychology: Human Perception and Performance, 45(3), 336–353. 10.1037/xhp0000609

<anchor name="c89"></anchor>

Yu, X., Hanks, T. D., & Geng, J. J. (2022). Attentional guidance and match decisions rely on different template information during visual search. Psychological Science, 33(1), 105–120. 10.1177/09567976211032225

<anchor name="c90"></anchor>

Yu, X., Zhou, Z., Becker, S. I., Boettcher, S. E. P., & Geng, J. J. (2023). Good-enough attentional guidance. Trends in Cognitive Sciences, 27(4), 391–403. 10.1016/j.tics.2023.01.007

<anchor name="c91"></anchor>

Zelinsky, G. J. (2008). A theory of eye movements during target acquisition. Psychological Review, 115(4), 787–835. 10.1037/a0013118

<anchor name="c92"></anchor>

Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual working memory. Nature, 453(7192), 233–235. 10.1038/nature06860

APPENDIX A: Model Parameter Estimation

Given that the two top-performing models showed similar performance (the weighted color collinear–shape/texture orthogonal model achieved a 0.7% higher R² in the first set, and the weighted three-way orthogonal model achieved a 1.6% higher R² in the second set), we performed a post hoc power analysis to estimate the sample size necessary to observe a stable difference between the two models.
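To make the two candidate models concrete, the Python sketch below gives one plausible formalization inferred from the model names; the function names, the per-dimension distinctiveness signals d_color, d_shape, and d_texture, and the weights w_color, w_shape, and w_texture are illustrative assumptions, not the article's own implementation. An orthogonal combination adds signals as perpendicular vectors (root sum of squares), whereas a collinear combination adds them linearly.

import numpy as np

def weighted_three_way_orthogonal(d_color, d_shape, d_texture,
                                  w_color, w_shape, w_texture):
    # All three weighted signals combine as orthogonal vectors:
    # overall distinctiveness is their root sum of squares.
    return np.sqrt((w_color * d_color) ** 2
                   + (w_shape * d_shape) ** 2
                   + (w_texture * d_texture) ** 2)

def weighted_color_collinear(d_color, d_shape, d_texture,
                             w_color, w_shape, w_texture):
    # Shape and texture combine orthogonally; the weighted color signal
    # then adds linearly (collinearly) to that combined signal.
    return (w_color * d_color
            + np.sqrt((w_shape * d_shape) ** 2 + (w_texture * d_texture) ** 2))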

For each tridimensional experiment, we sampled participants with replacement 50 times at each sample size from 1 to 40 and calculated how well the two models could be distinguished as a function of sample size. Results are reported in Figure A4. Overall, the R² values of both models keep increasing with sample size, but the model comparison consistently favors the weighted three-way orthogonal model from a sample size of 2 onward (Figure A4, left). In Set 1, the weighted three-way orthogonal model is at least 1,040 times (at a sample size of 2) and on average 1.47 × 10⁶ times more likely than the weighted color collinear–shape/texture orthogonal model to account for the variability in the data. In Set 2, the weighted three-way orthogonal model is at least 2.45 × 10⁶ times (also at a sample size of 2) and on average 1.83 × 10¹⁰ times more likely than the competing model.
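A minimal sketch of this resampling procedure follows. It assumes participant-level data indexed by ID and a hypothetical fit_aic(sample, model) helper that fits one model to a bootstrap sample and returns its AIC; the AIC-based relative likelihood exp((AIC_B − AIC_A)/2) is one standard way to obtain the relative likelihoods reported above, though the article does not spell out these implementation details.

import numpy as np

rng = np.random.default_rng(seed=0)

def bootstrap_comparison(participant_ids, fit_aic, n_boot=50, max_n=40):
    # For each sample size 1..max_n, resample participants with replacement
    # n_boot times and record the relative likelihood of the weighted
    # three-way orthogonal model (A) over the color collinear model (B).
    rel_likelihood = {}
    for n in range(1, max_n + 1):
        draws = []
        for _ in range(n_boot):
            sample = rng.choice(participant_ids, size=n, replace=True)
            aic_a = fit_aic(sample, model="three_way_orthogonal")
            aic_b = fit_aic(sample, model="color_collinear")
            draws.append(np.exp((aic_b - aic_a) / 2.0))  # > 1 favors model A
        rel_likelihood[n] = np.array(draws)
    return rel_likelihood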

To estimate power, for each simulation we computed the relative likelihood of the winning model (the weighted three-way orthogonal model) relative to the other model (the weighted color collinear–shape/texture orthogonal model). We then reported the proportion of simulations in which the winning model's relative likelihood exceeded 10, a reasonable cutoff for concluding that there is robust evidence in favor of the winning model (see Figure A4, right). For Set 1, power increases quickly with sample size, reaching as much as 50% with as few as two participants and exceeding 80% by a sample size of 7. The measure is noisy but remains between 0.7 and 0.9 over the range of 7–40 participants, which we interpret as adequate power. For Set 2, power increases more systematically with sample size, with a clear positive trend starting at a sample size of around 7. Between sample sizes of 20 and 40, power varies between 0.67 and 0.88, again an adequate amount of power. Overall, these analyses suggest that we gathered sufficient data to have confidence in our conclusions regarding the winning model.
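Under the same assumptions as the sketch above, the power estimate at each sample size is simply the proportion of bootstrap draws whose relative likelihood exceeds the cutoff of 10:

import numpy as np

def power_curve(rel_likelihood, cutoff=10.0):
    # Proportion of bootstrap draws, at each sample size, in which the
    # winning model's relative likelihood exceeds the evidence cutoff.
    return {n: float(np.mean(draws > cutoff))
            for n, draws in rel_likelihood.items()}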

For the weighted models, the weight estimates (see Figure A5) stabilized at a sample size of around 5 for the weighted color collinear–shape/texture orthogonal model, and at around 20 in Set 1 and around 10 in Set 2 for the weighted three-way orthogonal model. These results confirm that 20 participants should be sufficient both to distinguish between the two top-performing models and to obtain stable weight estimates.
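The stability check on the weight estimates can be sketched the same way, again with a hypothetical fit_weights(sample) helper that returns the fitted (w_color, w_shape, w_texture) for a bootstrap sample; estimates count as stable once their bootstrap spread stops shrinking with sample size.

import numpy as np

rng = np.random.default_rng(seed=0)

def weight_stability(participant_ids, fit_weights, n_boot=50, max_n=40):
    # Spread (SD across bootstrap draws) of each weight estimate
    # as a function of sample size.
    spread = {}
    for n in range(1, max_n + 1):
        w = np.array([fit_weights(rng.choice(participant_ids, size=n, replace=True))
                      for _ in range(n_boot)])  # shape: (n_boot, 3)
        spread[n] = w.std(axis=0)
    return spread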

Submitted: August 30, 2024. Revised: October 30, 2025. Accepted: November 9, 2025.
