Pathos

Speaker similarity preference

Added 2022-01-24 18:25:45 +0000 UTC

A new model is in development that will hopefully provide several improvements. I am looking for feedback on what I should prioritize. A speaker's identity is rather complex, and separating its aspects is a non-trivial matter.

There are two options. With high speaker similarity, little information about your own voice is kept and you sound more like the target voice. With lower speaker similarity, you sound less like the target, but more of your own voice is kept: things like accent, pitch, pronunciation, and possibly emotion. Which would you prefer?

Ideally, if needed, this would be an adjustable parameter within the application; if not, I will at least know what to prioritize. Thank you for your feedback!