Input tensor shape

[batch_size, Cin, num_feats, num_frames]
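
The layout above can be checked in code before data reaches a model. The following is a minimal sketch using only the Python standard library. The names `InputShape` and `parse_input_shape` are hypothetical helpers introduced for illustration, and the example sizes are made up. Reading `Cin` as the number of input channels is an assumption based on common naming, not something the source states.

```python
from typing import NamedTuple, Sequence


class InputShape(NamedTuple):
    """Named view of the 4-D layout [batch_size, Cin, num_feats, num_frames]."""
    batch_size: int
    Cin: int  # assumed to mean the number of input channels
    num_feats: int
    num_frames: int


def parse_input_shape(shape: Sequence[int]) -> InputShape:
    """Check that `shape` has four positive dimensions and name each axis."""
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions, got {len(shape)}: {tuple(shape)}")
    if any(int(d) <= 0 for d in shape):
        raise ValueError(f"all dimensions must be positive, got {tuple(shape)}")
    return InputShape(*(int(d) for d in shape))


# Illustrative values only: 8 examples, 1 channel, 80 features, 300 frames.
example = parse_input_shape((8, 1, 80, 300))
print(example)
```

With an array or tensor object `x` from a numerical library, the same check would typically be called as `parse_input_shape(tuple(x.shape))`.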