Latest Seminars
Firms increasingly use a combination of image and text description when displaying products or engaging consumers. Existing research has examined consumers' responses to text and image separately, but has yet to systematically consider the semantic relationship between them. In this research, we adopt a multi-method approach to examine how the congruence between image- and text-based product representations affects consumer preference. First, to measure image-text congruence, we propose a state-of-the-art two-branch neural network model based on Wide Residual Networks (WRN) and BERT. We apply this deep-learning method to individual-level consumption data from an online reading platform and discover a U-shaped effect of image-text congruence: consumers prefer a product when the image-text congruence is either high or low, but not when it is at a medium level. We further conduct lab experiments to validate the causal effect of this finding and explore the underlying mechanisms with an online study. Our study contributes to the literature on consumer information processing both methodologically and substantively, and it provides crucial and actionable managerial implications for marketing practitioners and online content creators.
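As a rough illustration of the two quantitative ideas in the abstract (a congruence score and a U-shaped relationship), the sketch below may help. It is not the speakers' model: the two-branch WRN and BERT network is not reproduced here. The sketch assumes the image and text embeddings are already available as plain vectors, substitutes cosine similarity as a stand-in congruence score, and probes for a U shape with an ordinary quadratic fit on synthetic numbers. The function names `cosine_congruence`, `fit_quadratic`, and `looks_u_shaped` are hypothetical.

```python
import math


def cosine_congruence(image_vec, text_vec):
    """Stand-in congruence score: cosine similarity of two embeddings.

    Assumption: both vectors come from some upstream encoders and live in
    a shared space. The abstract does not specify the scoring function.
    """
    if len(image_vec) != len(text_vec):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(image_vec, text_vec))
    norm_i = math.sqrt(sum(a * a for a in image_vec))
    norm_t = math.sqrt(sum(b * b for b in text_vec))
    if norm_i == 0 or norm_t == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_i * norm_t)


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    a = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
    rhs = [t0, t1, t2]
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("need at least three distinct x values")
    coefs = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        ak = [row[:] for row in a]
        for r in range(3):
            ak[r][k] = rhs[r]
        coefs.append(_det3(ak) / d)
    return tuple(coefs)


def looks_u_shaped(xs, ys):
    """Heuristic check: positive curvature with the minimum inside the data range."""
    _, b1, b2 = fit_quadratic(xs, ys)
    if b2 <= 0:
        return False
    turning_point = -b1 / (2 * b2)
    return min(xs) < turning_point < max(xs)


# Synthetic illustration only; these numbers are not from the study.
congruence = [0.0, 0.25, 0.5, 0.75, 1.0]
preference = [4 * (c - 0.5) ** 2 for c in congruence]
u_shape_detected = looks_u_shaped(congruence, preference)
```

A positive quadratic coefficient alone does not establish a U shape, which is why the check also requires the fitted minimum to fall inside the observed congruence range. A formal analysis would add significance tests and controls, which this sketch omits.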