tomcat

Issues with Tomcat7 on RHEL7

On RHEL7, up until Tomcat 7.0.59, Red Hat maintained backwards compatibility with the RHEL6 structures.  Once you upgrade to Tomcat 7.0.70-x, all of that changes.  Two things specifically:

First:
One big change is the handling of $CATALINA_OPTS: if you have multiple lines defining various $CATALINA_OPTS properties, your installation will break.  Once you upgrade, you must specify all of your $CATALINA_OPTS on a single line.  This has to do with the switch to systemd; regardless of the reason, it all has to be on one line or it won't work.
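
For example, a definition that used to be split across several appending lines has to be collapsed into one.  A minimal sketch, assuming the options live in /etc/tomcat/tomcat.conf (the exact file may differ on your installation, and the JVM options shown are only placeholders):

#/etc/tomcat/tomcat.conf (path is an assumption; adjust for your install)
# Broken under systemd: multiple lines appending to CATALINA_OPTS
#CATALINA_OPTS="-Xms512m -Xmx2048m"
#CATALINA_OPTS="$CATALINA_OPTS -Djava.awt.headless=true"

# Working: everything on one line
CATALINA_OPTS="-Xms512m -Xmx2048m -Djava.awt.headless=true"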

Second:
Ensure Tomcat is logging to /var/log/tomcat/catalina.out.

Create a new file, /etc/rsyslog.d/tomcat.conf, containing the following two lines:

:programname, contains, "server" /var/log/tomcat/catalina.out
:programname, contains, "server" ~

Afterwards, restart the rsyslog daemon:

service rsyslog restart
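
To sanity-check the rule (optional), you can send a test message with the same program name and confirm it lands in the file; logger and tail are standard utilities, and the tag "server" matches the filter above:

logger -t server "rsyslog-to-catalina.out test"
tail -n 1 /var/log/tomcat/catalina.out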

Those are the only issues I have found so far; I'll update this post if more surface.


When Tomcat stops responding to Apache

Today our multi-node Tomcat servers became unresponsive to user/web traffic.  A quick look at our monitoring tools indicated that the Tomcat servers themselves were healthy.  While the application administrator looked at catalina.out to see if we were missing something, I dug into the load balancer logs.  I immediately saw the following errors:

[Date] [error] ajp_read_header: ajp_ilink_receive failed
[Date] [error] (70007)The timeout specified has expired: proxy: read response failed from IP_ADDRESS:8009 (HOSTNAME)
[Date] [error] proxy: BALANCER: (balancer://envCluster). All workers are in error state for route (nodeName)

So we understood the problem; next we needed to understand why it happened and how to fix it.  The AJP documentation confirmed that the default AJP connection pool is configured with a size of 200 and an accept count (the request queue used when all connections are busy) of 100.  We confirmed that these matched our Tomcat AJP settings.  Increasing the MaxClients setting in Apache's configuration and a quick restart put us back in business.

Note:

Examining the logs, we could see a marked increase in testing activity today, which is what exposed this problem.  A further read of the Tomcat AJP documentation revealed that connections remain open indefinitely until the client closes them, unless 'keepAliveTimeout' is set on the AJP connection pool.  Ideally, the AJP connections should grow as load increases and then shrink back to an optimal number when load decreases.  The 'keepAliveTimeout' setting has the effect of closing connections once they have been inactive for the configured period.  Our keepAliveTimeout was already set and working, but I thought I should include that information here, since without that setting this problem would most likely have manifested much earlier than it did.

Solution:

Configure Apache 'MaxClients' to be equal to the total number of Tomcat AJP 'maxConnections' across all nodes.

This was already set; however, you will also want to make sure you configure Tomcat AJP 'keepAliveTimeout' to close connections after a period of inactivity.
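
As a rough sketch (the numbers here are only illustrative, not our production values), the two sides line up like this:

#/etc/httpd/conf/httpd.conf (worker MPM)
<IfModule worker.c>
    ServerLimit      16
    ThreadsPerChild  25
    MaxClients       400    # roughly the sum of maxConnections across all Tomcat nodes
</IfModule>

<!-- tomcat_dir/conf/server.xml on each node -->
<Connector port="8009" protocol="AJP/1.3"
           maxConnections="200"
           keepAliveTimeout="60000" />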

References:
Tomcat AJP: http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html
Apache MPM Worker: http://httpd.apache.org/docs/2.2/mod/worker.html


Apache-Tomcat VHost redirection

OK, this isn't rocket science, but I thought it worth documenting since I will probably forget in six months to a year when I'm asked to do this again.

The Situation: Tomcat running with an Apache front-end using AJP to pass all traffic through to Tomcat after authenticating against CAS.

The Problem: The Tomcat application did not exist in the root context, so traffic needed to be forwarded to DOMAIN/SUB-DIR over HTTPS to ensure data is secure.  We were simply forwarding all HTTP traffic to HTTPS and forwarding any URL with DOMAIN/SUB-DIR in the path.  That meant anyone going to DOMAIN/ was not being redirected to the application.

Where we were:

#/etc/httpd/conf/httpd.conf

<VirtualHost *:80>
    Redirect / https://DOMAIN/SUB-DIR
</VirtualHost>

#/etc/httpd/conf.d/ssl.conf

<Location /SUB-DIR>
       ProxyPass ajp://localhost:8009/SUB-DIR
       ProxyPassReverse  ajp://localhost:8009/SUB-DIR
</Location>

For a reason I don't have the details for (a change on the Tomcat application side), this stopped working.  Following CAS authentication, the user was being returned to https://DOMAIN/SUB-DIRSUB-DIR, which of course didn't work.  Since the application was now configured as desired, I needed to fix the rewrite/redirection issue.

Before I get to the solution: for all previous cases we had a consulting firm working with us, and they would simply put a redirection statement in the Tomcat root context.  Not really a great idea, but hey, I don't get paid the big bucks as a consultant, so what do I know!

The Solution:

First to handle all HTTP traffic:

#/etc/httpd/conf/httpd.conf
<VirtualHost *:80>
    Redirect / https://DOMAIN/
</VirtualHost>

Now to handle the secure HTTPS traffic.  My first attempt (without thinking it through) was to do this:

#/etc/httpd/conf.d/ssl.conf

<Location />
       ProxyPass ajp://localhost:8009/SUB-DIR
       ProxyPassReverse  ajp://localhost:8009/SUB-DIR
</Location>

<Location /SUB-DIR>
       ProxyPass ajp://localhost:8009/SUB-DIR
       ProxyPassReverse  ajp://localhost:8009/SUB-DIR
</Location>

This of course did not work because Apache was never reaching the /SUB-DIR test!  So, a quick cut and paste later, and I had this:

#/etc/httpd/conf.d/ssl.conf

<Location /SUB-DIR>
       ProxyPass ajp://localhost:8009/SUB-DIR
       ProxyPassReverse  ajp://localhost:8009/SUB-DIR
</Location>

<Location />
       ProxyPass ajp://localhost:8009/SUB-DIR
       ProxyPassReverse  ajp://localhost:8009/SUB-DIR
</Location>

This works.  It is clean and quick, the way it is supposed to be.

memcached

In support of the Kuali project.

Setting up true failover for the Kuali application servers.  Previously, if a node went down, the user would need to re-authenticate.  The following procedure configures the system so that it can lose a node without the users on that node losing their sessions.

My part on the system side was fairly straightforward:

yum install memcached
iptables -I INPUT -m state --state NEW -m tcp -p tcp --dport 11211 -j ACCEPT
service iptables save
chkconfig memcached on
service memcached start
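
A quick sanity check that memcached is up and reachable (memcached-tool ships with the memcached package):

memcached-tool 127.0.0.1:11211 stats | head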

With that configured, the work to enable Tomcat to leverage memcached can begin.

Parts of the following information were found at www.bradchen.com.

Download the most recent copies of the memcached-session-manager jars and install them in the tomcat_dir/lib directory.

Then, for each Tomcat installation, open tomcat_dir/conf/context.xml and add the following lines inside the <Context> tag:

<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
    memcachedNodes="n1:localhost:11211"
    requestUriIgnorePattern=".*\.(ico|png|gif|jpg|css|js)$" />

If memcached is listening on a different port, change the value in memcachedNodes.  Port 11211 is the default memcached port.
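
Since the point of this setup is surviving the loss of a node, each Tomcat would normally point at more than one memcached instance.  A sketch with two assumed hosts (memcached1/memcached2 are placeholders); with sticky sessions, memcached-session-manager's failoverNodes attribute tells each Tomcat which node to use only as a backup, typically the one running on its own machine:

<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
    memcachedNodes="n1:memcached1.example.com:11211,n2:memcached2.example.com:11211"
    failoverNodes="n1"
    requestUriIgnorePattern=".*\.(ico|png|gif|jpg|css|js)$" />

On the second Tomcat you would list n2 in failoverNodes instead.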

Open tomcat_dir/conf/server.xml and look for the following lines:

<Server port="8005" ...>
    ...
    <Connector port="8080" protocol="HTTP/1.1" ...>
    ...
    <Connector port="8009" protocol="AJP/1.3" ...>

Change the ports so that the two installations listen on different ports. This is optional, but I would also disable the HTTP/1.1 connector by commenting out its <Connector> tag, as the setup documented here only requires the AJP connector.
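
For example (the port numbers here are only illustrative), the second installation's server.xml might end up with:

<Server port="8006" ...>
    ...
    <!-- HTTP/1.1 connector commented out; only AJP is needed for this setup -->
    <!-- <Connector port="8081" protocol="HTTP/1.1" ...> -->
    ...
    <Connector port="8010" protocol="AJP/1.3" ...>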

Finally, look for this line, also in tomcat_dir/conf/server.xml:

<Engine name="Catalina" defaultHost="localhost" ...>

Add the jvmRoute property and assign it a value that is different between the two installations. For example:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1" ...>

And, for the second instance:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm2" ...>

That’s it for Tomcat configuration. This configuration uses memcached-session-manager’s default serialization strategy and enables sticky session support. For more configuration options, refer to the links in the references section.

In our Apache load balancer we add the following definition:

ProxyPass /REFpath balancer://Cluster_Name
ProxyPassReverse /REFpath balancer://Cluster_Name

<Proxy balancer://Cluster_Name>
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm1  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm2  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm3  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm4  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   ProxySet lbmethod=byrequests
   ProxySet stickysession=JSESSIONID|jsessionid
   ProxySet nofailover=On
</Proxy>

Note that the BalancerMember lines point to the ports and jvmRoutes configured above.  This sets up a load balancer that dispatches web requests to multiple Tomcat installations. When one of the Tomcat instances gets shut down, requests will be served by the other one that is still up. As a result, users do not experience downtime when one of the Tomcat instances is taken down for maintenance or application redeployment.

This step also sets up sticky sessions. What this means is that if a user begins a session with instance 1, she will be served by instance 1 throughout the entire session, unless of course that instance goes down. This can be beneficial in a clustered environment, as application servers can use session data stored locally without contacting a remote memcached.

CASify PSI-Probe

PSI-Probe is an extended Tomcat manager.  This post outlines how to place it behind CAS and specify permissions based on an LDAP attribute.  The method outlined here has been adapted from Jasig documentation on Tomcat Container Authentication, written by Marvin S. Addison.

Although the current stable build of PSI-Probe cannot be placed behind CAS (the role names are hard coded into the program), the current source code allows for flexible role names.

Check out the source code (SVN); the command can be found on PSI-Probe's website.  A readme is included in the source; follow the instructions found there (an Oracle driver must be imported into Maven to properly build PSI-Probe).  The readme explains how to package the source into a .war, which can then be deployed on the server.

The process of modifying the configuration files to support CAS authentication can be done either in the source code before building the .war or afterward on the server (by modifying the unpacked files).  My preference, for the sake of flexibility, is to put a placeholder in for the attribute names (role names) before building the .war, then use a script to string-replace the placeholder with the attribute name within the modified configuration files on the server.
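
As a sketch of that string replacement (ATTRIBUTE_PLACEHOLDER and the attribute value below are hypothetical; if the placeholder sits after the literal ROLE_ prefix in spring-probe-security.xml, one substitution covers both files):

# Replace the build-time placeholder with the real attribute/role name on the server
ATTR="kuali-probe-admins"   # hypothetical attribute value
sed -i "s/ATTRIBUTE_PLACEHOLDER/${ATTR}/g" \
    $TOMCAT_HOME/webapps/probe/WEB-INF/spring-probe-security.xml \
    $TOMCAT_HOME/webapps/probe/WEB-INF/web.xml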

Before modifying any configurations, you must obtain several .jar files and place them in your $TOMCAT_HOME/lib directory.

Found in CAS-Client under modules (can be downloaded from Jasig):

  • cas-client-core-$VERSION.jar
  • cas-client-integration-tomcat-common-$VERSION.jar
  • cas-client-integration-tomcat-v6-$VERSION.jar
  • commons-logging-$VERSION.jar
  • xmlsec-$VERSION.jar
  • commons-codec-$VERSION.jar

Can be obtained from Apache and OpenSAML:

  • log4j-$VERSION.jar
  • opensaml-1.1b.jar

The first file that must be modified is context.xml.  It can be found at the following locations:

Source: $SOURCE_DIR/web/src/main/conf/META-INF/context.xml
Server: $TOMCAT_HOME/conf/Catalina/localhost/probe.xml

The file should read:

<?xml version="1.0" encoding="UTF-8"?>
<Context path="/probe" privileged="true" >
  <!--
   The following configuration uses the SAML 1.1 protocol and role data
   provided by the assertion to enable dynamic server-driven role data.
   The attribute used for role data is "memberOf".
  -->
  <Realm
    className="org.jasig.cas.client.tomcat.v6.AssertionCasRealm"
    roleAttributeName="memberOf"
    />
  <Valve
    className="org.jasig.cas.client.tomcat.v6.Saml11Authenticator"
    encoding="UTF-8"
    casServerLoginUrl="https://login.example.com/cas/login"
    casServerUrlPrefix="https://login.example.com/cas/"
    serverName="your.server.example.com"
    />
  <!-- Single sign-out support -->
  <Valve
    className="org.jasig.cas.client.tomcat.v6.SingleSignOutValve"
    artifactParameterName="SAMLart"
    />
</Context>

The attribute does not have to be memberOf; any attribute name can be specified here.

Next, the roles have to be specified in the application.  This is done in spring-probe-security.xml.

Source: $SOURCE_DIR/web/src/main/webapp/WEB-INF/spring-probe-security.xml
Server: $TOMCAT_HOME/webapps/probe/WEB-INF/spring-probe-security.xml

The following section should be modified to read as outlined below.  ROLE_ must be followed by the name of the attribute or entitlement you set up to grant permission.

<sec:filter-invocation-definition-source>
     <sec:intercept-url pattern="/adm/**" access="ROLE_ATTRIBUTE"/>
     <sec:intercept-url pattern="/sql/**,/adm/restartvm.ajax" access="ROLE_ATTRIBUTE"/>
     <sec:intercept-url pattern="/app/**" access="ROLE_ATTRIBUTE"/>
     <sec:intercept-url pattern="/**" access="ROLE_ATTRIBUTE"/>
</sec:filter-invocation-definition-source>

Finally, the web.xml must be modified to properly filter access.

Source: $SOURCE_DIR/web/src/main/webapp/WEB-INF/web.xml
Server: $TOMCAT_HOME/webapps/probe/WEB-INF/web.xml

<context-param>
     <description>Role that can view session attribute values</description>
     <param-name>attribute.value.roles</param-name>
     <param-value>ROLE_ATTRIBUTE</param-value>
</context-param>
<auth-constraint>
     <role-name>$attribute_name</role-name>
</auth-constraint>
<security-role>
     <role-name>$attribute_name</role-name>
</security-role>

You will notice that there are several entries for ROLE_ and security-role within web.xml.  These provide levels of access, which can be specified (you will have to modify spring-probe-security.xml to reflect the permission levels set in web.xml).  If you don't deem them necessary, you can remove them completely from web.xml and simply leave a single entry (as shown above).
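
For example (the attribute values here are hypothetical), two levels might map like this: full access for an admin attribute, view-only for a second attribute.

<!-- spring-probe-security.xml -->
<sec:filter-invocation-definition-source>
     <sec:intercept-url pattern="/adm/**" access="ROLE_PROBE_ADMINS"/>
     <sec:intercept-url pattern="/**" access="ROLE_PROBE_ADMINS,ROLE_PROBE_VIEWERS"/>
</sec:filter-invocation-definition-source>

<!-- web.xml: both attribute values must also be declared as security roles -->
<security-role>
     <role-name>PROBE_ADMINS</role-name>
</security-role>
<security-role>
     <role-name>PROBE_VIEWERS</role-name>
</security-role>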

July 1, 2012: Linux problems? High CPU/Load? Probably caused by the Leap Second!

(Update posted, see below)

As posted in multiple places around the web:

Debian

/etc/init.d/ntp stop
date `date +"%m%d%H%M%C%y.%S"`

Red Hat

/etc/init.d/ntpd stop
date `date +"%m%d%H%M%C%y.%S"`

Update:

This first manifested itself for us in our Java stacks — all of our (dual processor) Tomcat servers were running at a load of 30-40.  However, this is a known (and fixed) kernel bug:

https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d

Apparently, simply forcing a reset of the date is enough to fix the problem:

date -s "`date`"