Jul 12, 2013

running multiple freeswitch instances on one host

Lately I have had a nasty stability problem, which I'm still investigating.
If you are curious: freeswitch locks up, and the log shows a repeating message:
....
[CRIT] switch_core_session.c:1634 Thread Failure!
[CRIT] switch_core_session.c:1593 LUKE: I'm hit, but not bad.
[CRIT] switch_core_session.c:1594 LUKE'S VOICE: Artoo, see what you can do with it. Hang on back there....
Green laserfire moves past the beeping little robot as his head turns. After a few beeps and a twist of his mechanical arm,
Artoo reduces the max sessions to XXX thus, saving the switch from certain doom.
....
While the message is amusing and suggests that the system is trying to recover, a restart is required to bring it back into service. To reduce downtime and recovery time, we decided to run two FS instances on a single machine, since the host's capacity is not maxed out. There are multiple ways of doing this, but to minimize configuration changes I tried to set each instance to listen on a specific IP of the multi-homed host.



conf/vars.xml:
<X-PRE-PROCESS cmd="set" data="local_ip_v4=IP1"/>
conf-2/vars.xml:
<X-PRE-PROCESS cmd="set" data="local_ip_v4=IP2"/>

This solution did not work cleanly. Setting local_ip_v4 does make freeswitch listen on ports 5060/5080 on the specified IP, but both instances still collide when listening on localhost:5060. The same problem occurs with the default administrative port, localhost:8021. So these have to be changed too:
conf/autoload_configs/event_socket.conf.xml:
<param name="listen-port" value="8021(change)"/>
conf/vars.xml:
<X-PRE-PROCESS cmd="set" data="internal_sip_port=5060(change)"/> 
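When the second conf directory starts as a copy of the first, edits like the ones above can be scripted. Below is a minimal sketch, assuming a hypothetical helper named set_fs_var (not part of freeswitch) and values that contain no "|" or "&" characters. It only rewrites X-PRE-PROCESS "set" lines of the form shown above; the port number in the usage line is an arbitrary example.

```shell
# Hypothetical helper (not part of freeswitch): rewrite the value of one
# X-PRE-PROCESS "set" variable in a vars.xml-style file.
# Usage: set_fs_var FILE NAME VALUE
set_fs_var() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Turn data="NAME=<old>" into data="NAME=VALUE"; all other lines pass through.
    sed "s|data=\"${name}=[^\"]*\"|data=\"${name}=${value}\"|" "$file" > "$tmp" || {
        rm -f "$tmp"
        return 1
    }
    # Write back through the original file so its owner and mode are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_fs_var /usr/local/freeswitch-2/conf/vars.xml internal_sip_port 5062 would move the second instance's internal profile to port 5062.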
The next problem came up with the service startup script (freeswitch.init.redhat).
The "status" command shows both processes, which is a minor nuisance, but "kill" stops a random instance, or both!

Here is the patch to resolve this problem.

With the modified init script, per-instance parameter changes are set in "/etc/sysconfig/instance_name". Example:
/etc/sysconfig/freeswitch-2:
FS_FILE=/usr/local/freeswitch-2/bin/freeswitch
PID_FILE=/usr/local/freeswitch-2/run/freeswitch.pid
FS_HOME=/var/run/freeswitch-2
LOCK_FILE=/var/lock/subsys/freeswitch-2
FREESWITCH_ARGS="-nc -base /usr/local/freeswitch-2/ -run /usr/local/freeswitch-2/run/ -conf /usr/local/freeswitch-2/conf/ -log /usr/local/freeswitch-2/log -db /usr/local/freeswitch-2/db "
The extra arguments are required, since I did not recompile the binaries for the secondary instance. The sysconfig for the primary instance is unchanged.
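To illustrate the general idea (this is not the patch itself, only a sketch under assumptions): an init script can act on exactly one instance if it loads that instance's sysconfig file and signals only the PID recorded in its PID_FILE, instead of matching every process named freeswitch. The function names and the SYSCONFIG_DIR override are hypothetical.

```shell
# Sketch only, not the actual patch to freeswitch.init.redhat.

# Load the settings of one instance, e.g. /etc/sysconfig/freeswitch-2.
# SYSCONFIG_DIR is a hypothetical override, handy for trying this out.
load_instance() {
    instance=$1
    conf="${SYSCONFIG_DIR:-/etc/sysconfig}/$instance"
    if [ ! -r "$conf" ]; then
        echo "no sysconfig file for $instance" >&2
        return 1
    fi
    . "$conf"
}

# Stop only the process recorded in this instance's pid file.
stop_instance() {
    if [ ! -r "$PID_FILE" ]; then
        echo "pid file $PID_FILE not found" >&2
        return 1
    fi
    pid=$(cat "$PID_FILE")
    kill -TERM "$pid"
}
```

With this shape, "load_instance freeswitch-2 && stop_instance" leaves the primary instance untouched, because its PID is never looked at.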

2 comments:

Anthony Minessale II said...

That error usually means you are running a 32 bit version of the OS. Try using a 64 bit environment or you may be able to reduce the overhead if you start FS from a shell that has executed "ulimit -s 244" to lower the stack size so it does not cost 12m of ram per thread.

FreeSWITCH runs way better in 64 bit.

Unknown said...

Thank you Anthony!
My environment is indeed 64-bit, and freeswitch binary is 64-bit too.
One of the limitations of running freeswitch as a daemon is that you don't see the process output. Some time ago I launched freeswitch in console mode and saw the message about ulimit, but the "auto-adjusting stack size for optimal performance..." line, and the setrlimit(RLIMIT_STACK.. call further on in the code, made me think that it's not an issue.
After splitting the load into two instances, I also set /etc/security/limits.conf:
freeswitch hard stack 10240
freeswitch soft stack 240

With both changes in place, I have not experienced the crash so far, but I'm not sure which one helped.
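One way to narrow this down is to check which stack limit a running instance actually received. A minimal sketch, assuming Linux with /proc mounted; the helper name stack_limit_of is hypothetical. Note that /proc reports bytes, while the stack values in limits.conf are in KB, so a soft limit of 240 should show up as 245760.

```shell
# Hypothetical helper: print the soft limit, hard limit and unit of the
# stack size for a given PID, as recorded in /proc/PID/limits (Linux only).
stack_limit_of() {
    awk '/^Max stack size/ { print $4, $5, $6 }' "/proc/$1/limits"
}
```

For the second instance this could be run as: stack_limit_of "$(cat /usr/local/freeswitch-2/run/freeswitch.pid)"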