I recently found that a Tomcat server was running out of memory. I gave the service more memory, but I believe the problem is a memory leak, so that is only a short-term solution.
Fixing the memory leak is obviously the best solution, but I’m not convinced that I’ll be able to, as I believe the leak is in a third-party API library. So restarting the Tomcat service every night is my best immediate solution.
I created the script below to stop and then restart my Tomcat Windows service. It has an extra step that kills the process, because I found that the service sometimes hung in the Stopping state.
Note this script is designed to run in Jenkins, hence things like ping instead of timeout.
@echo off
REM This script restarts the Tomcat service on the remote server.
REM There is an extra step to kill the process because for some reason the Tomcat service was not stopping
REM It was designed to be run inside Jenkins (hence ping instead of timeout)
REM parameters - you could pass them in if you want to
set servername=ServerName
set servicename=TomcatServiceName
echo Attempting to restart the service %servicename% on %servername%
REM Stop the Service
echo Stopping the service...
SC \\%servername% stop %servicename%
REM set a counter
set /A counter=0
:WaitForStopped
REM wait for the service to stop
REM if the service doesn't stop after 6 checks (about 30 seconds) then kill the process
if %counter% gtr 5 goto KillProcess
set /A counter = %counter% + 1
REM ping nowhere instead of a timeout
ping 127.0.0.1 -n 6 >nul
for /f "tokens=4" %%s in ('SC \\%servername% query %servicename% ^| find "STATE"') do if NOT "%%s"=="STOPPED" goto WaitForStopped
echo Service has now stopped!
REM start the service
echo Starting the service...
SC \\%servername% start %servicename%
:WaitForRunning
REM wait for the service to start
ping 127.0.0.1 -n 6 >nul
for /f "tokens=4" %%s in ('SC \\%servername% query %servicename% ^| find "STATE"') do if NOT "%%s"=="RUNNING" goto WaitForRunning
echo Service is now running!
REM I return success because by the time it reaches here everything must have worked.
exit 0
:KillProcess
echo Attempting to kill the process...
for /f "tokens=3" %%s in ('SC \\%servername% queryex %servicename% ^| find "PID"') do set "pid=%%s"
taskkill /S %servername% /PID %pid%
REM Reset counter to zero
set /A counter = 0
goto WaitForStopped
Remarks
- The main goal of the script is to ensure the service gets back up and running.
- After each stop or start command, I query the service status and wait until it reports STOPPED (or RUNNING) before continuing. If the service never gets there, the loop keeps going until the Jenkins timeout option (Abort the build if it’s stuck) aborts the build, which then sends out a failure notification.
- If the service takes too long to stop (about 30 seconds), I query for the process ID and kill the process.
- The ping is a workaround, because timeout doesn’t work in Jenkins. It pings localhost, which responds instantly, but ping waits about 1 second between requests, so 6 pings take roughly 5 seconds.
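To see how the token parsing and the ping delay behave in isolation, here is a minimal sketch that runs locally without a remote server. The two sample lines are made up to match the shape of what sc queryex prints, and the variable names stateLine and pidLine are only there for illustration; they are not part of the script above.

```batch
@echo off
REM Sample lines in the shape "sc queryex" prints; the values are made up.
set "stateLine=        STATE              : 4  RUNNING"
set "pidLine=        PID                : 1234"

REM Default delimiters are space and tab, so the tokens are: STATE, :, 4, RUNNING
for /f "tokens=4" %%s in ("%stateLine%") do set "state=%%s"

REM Here the tokens are: PID, :, 1234
for /f "tokens=3" %%p in ("%pidLine%") do set "pid=%%p"

echo state=%state% pid=%pid%

REM Pause about 5 seconds without "timeout": 6 pings, roughly 1 second apart
ping 127.0.0.1 -n 6 >nul
echo Waited about 5 seconds
```

Saved as a .bat file and run from a command prompt, this should print state=RUNNING pid=1234, pause for about 5 seconds, and then print the final message. The same token positions are what the restart script relies on when it reads the real sc output.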