Abstract: Speech emotion recognition (SER) is a fundamental step towards fluent human-machine interaction. One challenging problem in SER is obtaining an utterance-level feature representation for ...