Recognition of emotion-based utterances from speech has been produced in a number of languages and utilized in various applications. This paper makes use of the spoken utterances corpus recorded in Urdu with different emotions of normal and special children. In this paper, the performance of learning classifiers is evaluated with prosodic and spectral features. At the same time, their combinations considering children with autism spectrum disorder (ASD) as noise in terms of classification accuracy has also been discussed. The experimental results reveal that the prosodic features show significant classification accuracy in comparison with the spectral features for ASD children with different classifiers, whereas combinations of prosodic features show substantial accuracy for ASD children with J48 and rotation forest classifiers. Pitch and formant express considerable classification accuracy with MFCC and LPCC for special (ASD) children with different classifiers.