Thanks for the information.
Might I ask what hardware you would consider sufficient for approx. 1000 users, mostly 1:1 communication, with maybe 100 rooms?
Do you have 1:1 video and voice calls enabled?
And do you have, by any chance, a comparison to Matrix?
Voice/video calls depend on a strong, properly configured STUN/TURN server, not on the XMPP server (unless you use ejabberd itself as the STUN/TURN server; then you need a strong XMPP server).
And you need two of them, with different IP addresses, for STUN/TURN to work well.
On Matrix I spent days of my life finding out that both sides of a call need two well-configured STUN/TURN servers to quickly establish a stable voice/video connection, especially when the two sides come from different mobile/internet providers.
It's never that clear in the docs... I had to find it out by myself.
I think the rule of having two STUN/TURN servers configured on both servers involved in a voice/video call also holds for XMPP, even if they say it doesn't.
I also tested this a lot.
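For anyone who wants to repeat a basic check, here is a minimal sketch in Python that sends a single STUN Binding request (RFC 5389) over UDP and prints the public address the server reports. It only shows that a STUN server answers; it does not test TURN relaying or credentials. The function names are my own, and the hostnames in the usage note are placeholders, not real servers.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding request without attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_binding_response(data, transaction_id):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, else None."""
    if len(data) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if data[8:20] != transaction_id:
        return None
    pos, end = 20, min(len(data), 20 + msg_len)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # value[1] == 0x01 means the address family is IPv4
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            raw_ip = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", raw_ip)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def query_stun(host, port=3478, timeout=3.0):
    """Send one Binding request over UDP; return (ip, port) or None if no usable reply."""
    packet, tid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            data, _ = sock.recvfrom(2048)
        except OSError:  # covers timeouts and name resolution failures
            return None
    return parse_binding_response(data, tid)
```

Calling `query_stun("stun1.example.org")` and `query_stun("stun2.example.org")` (placeholder names) from each of the two networks should return an address and port for both servers; a `None` means the request timed out or the reply could not be parsed.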