Abstract:
Recently proposed automatic pathological speech classification techniques use unsupervised auto-encoders to obtain a high-level abstract representation of speech.
Since these representations are learned by reconstructing the input, there is no guarantee that they are robust to pathology-unrelated cues such as speaker identity information.
Further, these representations are not necessarily discriminative for pathology detection.
In this paper, we exploit supervised auto-encoders to extract robust and discriminative speech representations for Parkinson's disease classification.
To reduce the influence of speaker variability unrelated to pathology, we propose to obtain speaker identity-invariant representations by adversarially training an auto-encoder against a speaker identification task.
To obtain a discriminative representation, we propose to jointly train an auto-encoder and a pathological speech classifier.
Experimental results on a Spanish database show that the proposed supervised representation learning methods yield more robust and discriminative representations for automatically classifying Parkinson's disease speech, outperforming the baseline unsupervised representation learning system.
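As an illustration only, the two supervised objectives described above can be sketched in a few lines of Python. The abstract does not give loss functions or weights, so everything here is an assumption: the mean squared reconstruction error, the cross-entropy terms, the sign convention for the adversarial term, and the hypothetical weights `lambda_spk` and `lambda_pd`. It is a sketch of one common way to set up such objectives, not the authors' implementation.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between input and auto-encoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability that a classifier assigns to the correct class."""
    return -math.log(probs[label])


def adversarial_objective(x, x_hat, spk_probs, spk_label, lambda_spk=0.1):
    """Encoder-side loss for speaker identity-invariant representations.

    The speaker-identification loss is subtracted (assumed convention), so
    minimizing this value keeps reconstruction good while making the speaker
    harder to identify from the representation.
    """
    return mse(x, x_hat) - lambda_spk * cross_entropy(spk_probs, spk_label)


def joint_objective(x, x_hat, pd_probs, pd_label, lambda_pd=1.0):
    """Loss for discriminative representations.

    The pathology-classification loss is added, so the auto-encoder and the
    classifier are trained jointly.
    """
    return mse(x, x_hat) + lambda_pd * cross_entropy(pd_probs, pd_label)
```

In an adversarial setup of this kind, the speaker classifier itself would typically be updated separately to minimize its own cross-entropy, while the encoder minimizes `adversarial_objective`.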
Type:
CONFERENCE PAPER