Experimental and cross-linguistic studies have shown that vocal iconicity is prevalent in words that carry meanings related to SIZE and SHAPE. Although these studies demonstrate the importance of vocal iconicity and reveal the cognitive biases underpinning it, there is less work demonstrating how these biases lead to the evolution of a sound symbolic lexicon in the first place. In this study, we show how words can be shaped by cognitive biases through cultural evolution. Using a simple experimental setup resembling the game of telephone, we examined how a single word form changed as it was passed from one participant to the next by a process of immediate iterated learning. About 1,500 naïve participants were recruited online and divided into five condition groups. The participants in the CONTROL-group received no information about the meaning of the word they were about to hear, while the participants in the remaining four groups were informed that the word meant either BIG or SMALL (with the meaning being presented in text), or ROUND or POINTY (with the meaning being presented as a picture). The first participant in a transmission chain was presented with a phonetically diverse word and asked to repeat it. Thereafter, the recording of the repeated word was played for the next participant in the same chain. The sounds of the audio recordings were then transcribed and categorized according to six binary sound parameters. Modelling the proportion of vowels or consonants for each sound parameter showed that FRONT UNROUNDED vowels increased in the SMALL-condition and ACUTE consonants increased in the POINTY-condition. The results show that linguistic transmission is sufficient for vocal iconicity to emerge, which demonstrates the role non-arbitrary associations play in the evolution of language.
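As an illustration of the measurement step described above, the following minimal sketch computes the proportion of vowels that are FRONT UNROUNDED for each generation of a single transmission chain. It is not the analysis code used in the study: the vowel sets are a simplified, hypothetical inventory, the function names are invented for this example, and the toy chain is made-up data that does not come from the experiment.

```python
# Hypothetical, simplified vowel inventory (an assumption for illustration,
# not the transcription scheme used in the study).
FRONT_UNROUNDED = set("iɪeɛæ")
OTHER_VOWELS = set("yøuoɔɑə")
VOWELS = FRONT_UNROUNDED | OTHER_VOWELS


def front_unrounded_proportion(transcription):
    """Return the share of vowels in one word that are front unrounded.

    Each character is treated as one segment. Returns None if the
    word contains no vowels from the inventory above.
    """
    vowels = [seg for seg in transcription if seg in VOWELS]
    if not vowels:
        return None
    hits = sum(1 for seg in vowels if seg in FRONT_UNROUNDED)
    return hits / len(vowels)


def proportions_by_generation(chain):
    """Apply the binary parameter to every generation of one chain."""
    return [front_unrounded_proportion(word) for word in chain]


# Invented toy chain: one transcribed word per participant (generation).
toy_chain = ["kumolo", "kimolo", "kimilo", "kimili"]

for generation, proportion in enumerate(proportions_by_generation(toy_chain), start=1):
    print(generation, proportion)
```

The same pattern would apply to any of the binary sound parameters, for example ACUTE consonants, by swapping in the relevant segment sets. Per-generation proportions of this kind are the sort of quantity that could then be compared across conditions in a statistical model.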