redhat

Flush This!

I came across this today and learned something new so thought I would share it here.

After killing 2 processes that had hung I noticed the following in the ps output:

root       373     2  0 Jun11 ?        00:00:00 [kdmflush]
root       375     2  0 Jun11 ?        00:00:00 [kdmflush]
root       863     2  0 Jun11 ?        00:00:00 [kdmflush]
root       867     2  0 Jun11 ?        00:00:00 [kdmflush]
root      1132     2  0 Jun11 ?        00:01:03 [flush-253:0]
root      1133     2  0 Jun11 ?        00:00:43 [flush-253:2]

Now kdmflush I am used to seeing, but flush-253: was something I had never noticed, so I decided to dig.  I started with man flush, but that led nowhere since I am not running sendmail or any mail server.  I turned to Google (not too proud to admit it) and searched “linux process flush”.  Turns out each ‘flush-MAJOR:MINOR’ is a kernel writeback thread for one block device.  It writes dirty pages from the page cache (data modified in memory but not yet on disk) out to that device so the RAM can be reused.  So ‘flush’ is writing out dirty pages, most likely left behind by the processes I just killed.

I discovered these commands that shed more light on what is actually happening:

grep 253 /proc/self/mountinfo 
20 1 253:0 / / rw,relatime - ext4 /dev/mapper/vg_kfs10-lv_root rw,seclabel,barrier=1,data=ordered
25 20 253:3 / /home rw,relatime - ext4 /dev/mapper/vg_kfs10-lv_home rw,seclabel,barrier=1,data=ordered
26 20 253:2 / /var rw,relatime - ext4 /dev/mapper/vg_kfs10-LogVol03 rw,seclabel,barrier=1,data=ordered
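
If you want to go straight from a flush thread name to its mount point, here is a rough sketch.  The function name mount_for_dev is my own invention; it only relies on the mountinfo layout shown above, where the third field is MAJOR:MINOR and the fifth is the mount point.

```shell
# Print the mount point(s) backed by a device number such as 253:0.
# Usage: mount_for_dev MAJOR:MINOR [MOUNTINFO_FILE]
mount_for_dev() {
    # Field 3 of mountinfo is MAJOR:MINOR, field 5 is the mount point.
    awk -v dev="$1" '$3 == dev { print $5 }' "${2:-/proc/self/mountinfo}"
}

# Example: mount_for_dev 253:2
```

The optional second argument is only there so the function can be pointed at a saved copy of mountinfo.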

Remember my listings were for flush-253:0 and flush-253:2, so I now know which file systems are being written to: / and /var.  Another interesting command to use is the following, which shows the activity of writing out dirty pages:

watch grep -A 1 dirty /proc/vmstat
nr_dirty 2
nr_writeback 0

If these numbers stay significantly higher you might have a bigger problem on your system, though from what I have read a temporary spike can simply mean a sync is in progress.  If this becomes a problem on your server you can tune the writeback thresholds by adding lines like the following to /etc/sysctl.conf:

vm.dirty_background_ratio = 50
vm.dirty_ratio = 80

Then (as root) execute:

# sysctl -p

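The “vm.dirty_background_ratio” setting is the percentage of available memory that may be dirty before the Linux kernel starts the background task of writing out dirty pages.  The above raises this setting from the usual default of 10% to 50%.  The “vm.dirty_ratio” setting is the percentage at which processes doing writes are forced to write out dirty pages themselves, so their IO effectively becomes synchronous and they wait for the underlying device to complete it (which is something you never want to happen).  Keep in mind that raising both values lets more unwritten data pile up in RAM before the kernel acts.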
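
Before touching sysctl.conf it is worth looking at what the box is using now.  The following is only a sketch: check_dirty_ratios is a made-up helper that reads the "key = value" lines sysctl prints and complains when the background threshold is not below the hard limit, since background writeback is supposed to kick in first.

```shell
# Read "key = value" lines (the format printed by sysctl) on stdin and
# report whether the background threshold sits below the hard limit.
check_dirty_ratios() {
    awk -F' *= *' '
        $1 == "vm.dirty_background_ratio" { bg = $2 }
        $1 == "vm.dirty_ratio"            { hard = $2 }
        END {
            if (bg == "" || hard == "") { print "missing value"; exit 2 }
            if (bg + 0 < hard + 0) {
                print "ok: background " bg "% < hard limit " hard "%"
            } else {
                print "warning: background " bg "% >= hard limit " hard "%"
                exit 1
            }
        }'
}

# Example: sysctl vm.dirty_background_ratio vm.dirty_ratio | check_dirty_ratios
```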

I did not add these to the sysctl.conf file but thought it worth documenting.

Crontab Sudo Shenanigans

OK, here is a situation I haven’t seen in a while and it tripped me up.  There, I admitted it!

We have an application that requires a restart of Apache every day (that is a different discussion).  Regardless, I gave them sudo access so they could script the job to run with their process.  Obviously I thought nothing more of it: problem solved, more pressing things to do.  It worked like a charm until they put their script into cron.  They received the error:

sudo: sorry, you must have a tty to run sudo

I didn’t want to throw the baby out with the bathwater and drop the tty requirement for all of cron-dom, and I like command-line solutions over config files (less to maintain/remember).  So I tried this variation:

su --session-command="/usr/bin/sudo /sbin/service httpd restart" user_name

Slick, huh?  Well, of course it didn’t work because sudo is in control; pesky security controls keep me on the straight and narrow.  This left me with one option: exempt the user (not everyone) from the tty requirement.  The solution for that, in /etc/sudoers (edited with visudo), is:

Defaults    requiretty
Defaults:%group_name !requiretty
Defaults:user_name !requiretty

In case that isn’t clear enough: the first line requires a TTY for all users and groups not expressly excluded from that requirement.  The second line exempts the group from the requirement, and the third line specifically exempts the user.  Including both user_name and group_name is redundant; however, it saves me revisiting the configuration file if we expand the group.
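
To double-check later who ended up exempt, something like the following sketch can help.  The tty_exempt name is hypothetical, and it only understands the simple one-entry-per-line form used above; reading /etc/sudoers itself needs root.

```shell
# List the users/groups exempted from requiretty in a sudoers-style file.
# Usage: tty_exempt FILE
tty_exempt() {
    awk '$1 ~ /^Defaults:/ && $2 == "!requiretty" {
             sub(/^Defaults:/, "", $1)   # keep only the user or %group part
             print $1
         }' "$1"
}

# Example (as root): tty_exempt /etc/sudoers
```

After any manual edit, visudo -c checks the file for syntax errors.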

This ends the brain dump…

My MySQL Cheat Sheet

I know, man.  No, I mean I know I could use ‘man pages’!  Or I could just ‘google it’ but then it isn’t mine.  Since I do not have time for a complete brain-dump this MySQL “cheat sheet” will grow over time.  Feel free to add your favorite MySQL commands in the comments; if they’re really useful I’ll add them to the list!

If you don’t know what MySQL is…look it up!  And, who are you?!  Seriously…

Create a DB & Assign to a User:

Create a New DB, Create a User and Grant them permissions to the New DB.

mysql> create database someDB_name;
Query OK, 1 row affected (0.13 sec)

mysql> create user 'someUser_name'@'localhost' IDENTIFIED BY 'some_password';
Query OK, 0 rows affected (0.13 sec)

mysql> GRANT ALL PRIVILEGES ON someDB_name.* to someUser_name@localhost;
Query OK, 0 rows affected (0.05 sec)

The above should be pretty self-explanatory, but for thoroughness’ sake…  The first command creates an empty database.  At that point only the root or admin user can use this database.  The second command creates a user account and assigns it a password.  This user account has NO privileges at this point.  The third command is the most important.  When you grant permissions you can grant global permissions with *.*, meaning the user can access ALL databases (not a good idea).  OR you can set database-level permissions like I did above: database_name.*.  The .* after the database name means every table in that database, and only that database; ALL PRIVILEGES is what makes the access full.  OR you can refine the permissions even further and grant them on a specific table in the database: database_name.some_table.  Hope that clarifies things.  To state it more succinctly, use this framework:

 GRANT [type of permission] ON [database_name].[table_name] TO '[username]'@'localhost';
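
If you create databases often, the three statements can be generated from a small shell function.  This is just a sketch: new_db_sql is a name I made up, it does no quoting or escaping of its arguments, and it only prints the SQL so you can read it before running anything.

```shell
# Print the SQL to create a database plus a user limited to it.
# Usage: new_db_sql DB_NAME USER_NAME PASSWORD
new_db_sql() {
    cat <<EOF
CREATE DATABASE $1;
CREATE USER '$2'@'localhost' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'localhost';
EOF
}

# Example (review the output first, then pipe it to the client):
#   new_db_sql someDB_name someUser_name 'some_password' | mysql -u root -p
```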

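Once you have finalized the permissions that you want to set up for your new users, you can reload the privileges.  GRANT and CREATE USER take effect right away, so the reload is only strictly required if you edited the grant tables directly, but it does no harm.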

FLUSH PRIVILEGES;

Your changes will now be in effect.  I always like to test the account out before giving the account to the user.  To test out your new user, log out and log back in as the user:

mysql> quit
mysql -u [username] -p

Revoke User Access or Delete a whole DB:

If you need to revoke a permission, the structure is almost identical to granting it, except that TO becomes FROM:

 REVOKE [type of permission] ON [database name].[table name] FROM '[username]'@'localhost';

You delete a database with DROP DATABASE.  You can also use DROP to delete a user altogether:

 DROP USER 'demo'@'localhost';

Recover Access when you have forgotten the root password:

Not that that ever happens…

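# stop the running server first (the service name may differ on your system)
service mysqld stop
mysqld_safe --skip-grant-tables &
mysql --user=root mysql

    -- MySQL 5.6 and earlier; 5.7 and later keep this in authentication_string
    update user set Password=PASSWORD('new-password') where user='root';
    flush privileges;
    exit;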

That’s it for now.  More to follow…

Putting ‘lsof’ to use

lsof is a powerful tool that has proven very useful over the years in troubleshooting and forensic investigations.  Here are some useful lsof command examples:

In this example we are looking at all the files a given process has open (pid=1767 here; this is the Zabbix agent):

lsof -p 1767

Note you can clean up the output with something like the ‘cut’ or ‘awk’ command to focus in on the columns you are most interested in.  The output from the above command looks like this:

COMMAND    PID   USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
zabbix_ag 1767 zabbix  cwd    DIR  253,0     4096       2 /
zabbix_ag 1767 zabbix  rtd    DIR  253,0     4096       2 /
zabbix_ag 1767 zabbix  txt    REG  253,0   209432 1315973 /usr/sbin/zabbix_agentd
zabbix_ag 1767 zabbix  mem    REG  253,0   156872  917626 /lib64/ld-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0  1922152  917633 /lib64/libc-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0   145720  917661 /lib64/libpthread-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0    22536  917663 /lib64/libdl-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0    91096  917658 /lib64/libz.so.1.2.3
zabbix_ag 1767 zabbix  mem    REG  253,0   598680  917655 /lib64/libm-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0   113952  917683 /lib64/libresolv-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0    43392  917665 /lib64/libcrypt-2.12.so
zabbix_ag 1767 zabbix  mem    REG  253,0   386040  917664 /lib64/libfreebl3.so
zabbix_ag 1767 zabbix  mem    REG  253,0   224328 1317809 /usr/lib64/libssl3.so
zabbix_ag 1767 zabbix  mem    REG  253,0  1286744 1317807 /usr/lib64/libnss3.so
zabbix_ag 1767 zabbix  mem    REG  253,0    21256  917689 /lib64/libplc4.so
zabbix_ag 1767 zabbix  mem    REG  253,0   243096  917688 /lib64/libnspr4.so
zabbix_ag 1767 zabbix  mem    REG  253,0   177952 1317480 /usr/lib64/libnssutil3.so
zabbix_ag 1767 zabbix  mem    REG  253,0    17096  917690 /lib64/libplds4.so
zabbix_ag 1767 zabbix  mem    REG  253,0   108728 1312777 /usr/lib64/libsasl2.so.2.0.23
zabbix_ag 1767 zabbix  mem    REG  253,0   183896 1317813 /usr/lib64/libsmime3.so
zabbix_ag 1767 zabbix  mem    REG  253,0    63304  917530 /lib64/liblber-2.4.so.2.5.6
zabbix_ag 1767 zabbix  mem    REG  253,0   317168  917569 /lib64/libldap-2.4.so.2.5.6
zabbix_ag 1767 zabbix  DEL    REG    0,4                0 /SYSV6c0004c9
zabbix_ag 1767 zabbix  mem    REG  253,0    65928  917605 /lib64/libnss_files-2.12.so
zabbix_ag 1767 zabbix    0r   CHR    1,3      0t0    3662 /dev/null
zabbix_ag 1767 zabbix    1w   REG  253,2      386     120 /var/log/zabbix/zabbix_agentd.log
zabbix_ag 1767 zabbix    2w   REG  253,2      386     120 /var/log/zabbix/zabbix_agentd.log
zabbix_ag 1767 zabbix    3wW  REG  253,2        4  389438 /var/run/zabbix/zabbix_agentd.pid
zabbix_ag 1767 zabbix    4u  IPv4  13481      0t0     TCP *:zabbix-agent (LISTEN)
zabbix_ag 1767 zabbix    5u  IPv6  13482      0t0     TCP *:zabbix-agent (LISTEN)

In the above, the FD column shows the file descriptor number followed by its mode (r for read, w for write, u for both).  The 4th line from the bottom has an FD value of 2w, meaning descriptor 2 is open for writing, which makes sense since it points at a log.
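
As an example of the awk clean-up mentioned earlier, here is a sketch that keeps only the numbered descriptors and prints just the FD and NAME columns.  The open_fds name is mine, it assumes the default column layout (not the wider -Z one), and file names containing spaces would be cut short.

```shell
# From lsof output on stdin, keep rows whose FD is a numbered descriptor on a
# regular file or character device, and print the FD and NAME columns.
open_fds() {
    awk 'NR > 1 && $4 ~ /^[0-9]+[rwu]/ && ($5 == "REG" || $5 == "CHR") {
             print $4, $NF
         }'
}

# Example: lsof -p 1767 | open_fds
```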

The -Z option for ‘lsof’ adds each process’s SELinux security context to the output.  This option is only available on Linux systems that have an SELinux-enabled kernel.

# lsof -Z -p 1767
COMMAND    PID SECURITY-CONTEXT                USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  cwd    DIR  253,0     4096       2 /
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  rtd    DIR  253,0     4096       2 /
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  txt    REG  253,0   209432 1315973 /usr/sbin/zabbix_agentd
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   156872  917626 /lib64/ld-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0  1922152  917633 /lib64/libc-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   145720  917661 /lib64/libpthread-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    22536  917663 /lib64/libdl-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    91096  917658 /lib64/libz.so.1.2.3
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   598680  917655 /lib64/libm-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   113952  917683 /lib64/libresolv-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    43392  917665 /lib64/libcrypt-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   386040  917664 /lib64/libfreebl3.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   224328 1317809 /usr/lib64/libssl3.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0  1286744 1317807 /usr/lib64/libnss3.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    21256  917689 /lib64/libplc4.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   243096  917688 /lib64/libnspr4.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   177952 1317480 /usr/lib64/libnssutil3.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    17096  917690 /lib64/libplds4.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   108728 1312777 /usr/lib64/libsasl2.so.2.0.23
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   183896 1317813 /usr/lib64/libsmime3.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    63304  917530 /lib64/liblber-2.4.so.2.5.6
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0   317168  917569 /lib64/libldap-2.4.so.2.5.6
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  DEL    REG    0,4                0 /SYSV6c0004c9
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix  mem    REG  253,0    65928  917605 /lib64/libnss_files-2.12.so
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    0r   CHR    1,3      0t0    3662 /dev/null
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    1w   REG  253,2      386     120 /var/log/zabbix/zabbix_agentd.log
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    2w   REG  253,2      386     120 /var/log/zabbix/zabbix_agentd.log
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    3wW  REG  253,2        4  389438 /var/run/zabbix/zabbix_agentd.pid
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    4u  IPv4  13481      0t0     TCP *:zabbix-agent (LISTEN)
zabbix_ag 1767 system_u:system_r:initrc_t:s0 zabbix    5u  IPv6  13482      0t0     TCP *:zabbix-agent (LISTEN)

I’ll add more when I have time; comment if you want to see something specific.


Changing the Volume Group Name

One of the problems with cloning a system is that it has the same volume group names as the server it was cloned from.  Not a huge problem but it can limit your ability to leverage the volume group.  The fix appears easy but there is a gotcha.

Red Hat ships a nice utility for this: vgrename

If you use that command and think you are done, you will be sorely mistaken if your root file system is on a volume group!  I’m speaking from experience there, so listen up!

If you issue the vgrename command:

# vgrename OldVG_Name NewVG_Name

it works like a charm.  If you happen to reboot your system at this point you are in big trouble… The system will Kernel Panic.

If you updated/changed the name of the VG that contains the root file system you need to modify the following two files to reflect the NewVG_Name.

  1. In /etc/fstab.   This one is obvious and I usually remember.
  2. In /etc/grub.conf.  Otherwise the kernel tries to mount the root file-system using the old volume group name.

The change is easy using ‘vi’.  Open the file in vi, then use vi’s sed-style substitute command; for example:

vi /etc/fstab
:%s/OldVG_Name/NewVG_Name/g
:wq

The :wq saves the changes.  Don’t forget to make the same substitution in /etc/grub.conf.
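
If you would rather not do this by hand in two files, here is a sketch of the same substitution as a shell function.  rename_vg_refs is a hypothetical name, the VG names are used as plain sed patterns (so no special characters), and a .bak copy is kept next to each file.  It writes back through the original path instead of using sed -i, so a symlinked /etc/grub.conf stays a symlink.

```shell
# Replace every occurrence of OLD with NEW in the given files, keeping a
# .bak copy of each file next to the original.
# Usage: rename_vg_refs OLD NEW FILE...
rename_vg_refs() {
    old=$1 new=$2
    shift 2
    for f in "$@"; do
        # Copy first, then rewrite the original in place from the copy.
        cp -p "$f" "$f.bak" && sed "s/$old/$new/g" "$f.bak" > "$f"
    done
}

# Example (as root, right after vgrename):
#   rename_vg_refs OldVG_Name NewVG_Name /etc/fstab /etc/grub.conf
```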

memcached

In support of the Kuali project.

Setting up true failover for the Kuali application servers.  Currently, if a node goes down, its users need to re-authenticate.  The following procedure configures the system so it can lose a node without the users on that node losing their session.

My part on the system side was fairly straightforward:

yum install memcached
iptables -I INPUT -m state --state NEW -m tcp -p tcp --dport 11211 -j ACCEPT
service iptables save
chkconfig memcached on
service memcached start
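
To confirm memcached is actually answering, you can send it the stats command and pick out a value.  A small sketch, assuming nc is installed; memcached_stat is a helper name I made up.

```shell
# Pull one value out of memcached "stats" output read from stdin.
# Usage: memcached_stat NAME
memcached_stat() {
    # Lines look like "STAT curr_connections 10" and end in CR LF.
    tr -d '\r' | awk -v key="$1" '$1 == "STAT" && $2 == key { print $3 }'
}

# Example:
#   printf 'stats\r\nquit\r\n' | nc localhost 11211 | memcached_stat uptime
```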

With that configured the work to enable tomcat to leverage memcached can begin:

Parts of the following information were found at www.bradchen.com.

Download the most recent copy of the following jars (links provided) and install them to the tomcat_dir/lib directory:

For each Tomcat installation, open tomcat_dir/conf/context.xml, and add the following lines inside the <Context> tag:

<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
    memcachedNodes="n1:localhost:11211"
    requestUriIgnorePattern=".*\.(ico|png|gif|jpg|css|js)$" />

If memcached is listening on a different port, change the value in memcachedNodes.  Port 11211 is the default port for memcached.

Open tomcat_dir/conf/server.xml, look for the following lines:

<Server port="8005" ...>
    ...
    <Connector port="8080" protocol="HTTP/1.1" ...>
    ...
    <Connector port="8009" protocol="AJP/1.3" ...>

Change the ports so the two installations listen on different ports.  This is optional, but I would also disable the HTTP/1.1 connector by commenting out its <Connector> tag, as the setup documented here only requires the AJP connector to be enabled.

Finally, look for this line, also in tomcat_dir/conf/server.xml:

<Engine name="Catalina" defaultHost="localhost" ...>

Add the jvmRoute property and assign it a value that is different between the two installations.  For example:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1" ...>

And, for the second instance:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm2" ...>

That’s it for Tomcat configuration.  This configuration uses memcached-session-manager’s default serialization strategy and enables sticky session support.  For more configuration options, refer to the memcached-session-manager documentation.

In our Apache load balancer we add the following definition:

ProxyPass /REFpath balancer://Cluster_Name
ProxyPassReverse /REFpath balancer://Cluster_Name

<Proxy balancer://Cluster_Name>
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm1  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm2  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm3  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   BalancerMember ajp://HOSTNAME:8009/REFpath route=jvm4  timeout=600 min=10 max=100 ttl=60 retry=120 connectiontimeout=10
   ProxySet lbmethod=byrequests
   ProxySet stickysession=JSESSIONID|jsessionid
   ProxySet nofailover=On
</Proxy>

Note that the BalancerMember lines point to the ports and jvmRoutes configured above.  This sets up a load balancer that dispatches web requests to multiple Tomcat installations.  When one of the Tomcat instances is shut down, requests are served by the ones that are still up.  As a result, users do not experience downtime when a Tomcat instance is taken down for maintenance or application redeployment.

This step also sets up sticky sessions.  This means that if a user begins a session with instance 1, she is served by instance 1 throughout the entire session, unless of course that instance goes down.  This can be beneficial in a clustered environment, as application servers can use session data stored locally without contacting a remote memcached.
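
Since every route= value has to match a jvmRoute exactly, I find it less error-prone to generate the BalancerMember lines from a list.  A minimal sketch; balancer_members and the app1/app2 host names in the example are made up, and the tuning parameters (timeout, retry and so on) are left for you to append.

```shell
# Read "host route" pairs on stdin and print one BalancerMember line each.
# Usage: balancer_members URL_PATH [AJP_PORT]
balancer_members() {
    while read -r host route; do
        [ -n "$host" ] || continue   # skip blank lines
        printf '   BalancerMember ajp://%s:%s%s route=%s\n' \
            "$host" "${2:-8009}" "$1" "$route"
    done
}

# Example:
#   printf 'app1 jvm1\napp2 jvm2\n' | balancer_members /REFpath
```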

Increasing the size of a filesystem

fdisk -l
fdisk /dev/sdc

In fdisk:

c (toggle the DOS compatibility flag; on Red Hat 6 this switches the deprecated mode off)
p (print the partition table to make sure the disk is not in use)
n (new partition)
p (primary partition)
1 (give it a number 1-4, then set start and end sectors)
w (write table to disk and exit)

Now create a physical volume, add it to the VG, extend the LV and then the file system.

pvcreate /dev/sdc1
vgextend VG_NAME /dev/sdc1
lvextend -L+5G LV_PATH (e.g.: /dev/VG_NAME/LV_NAME)
resize2fs LV_PATH
(OR if using xfs: xfs_growfs MOUNT_POINT)

Done.
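
For repeat use, the four LVM steps can be wrapped in a function that only prints what it would do unless told otherwise.  This is a sketch under my own naming (grow_lv, and the RUN=1 switch); it covers the ext case only, and the fdisk step is still done by hand.

```shell
# Print (or, with RUN=1, execute) the commands that add a partition to a
# volume group and grow an ext file system on one of its logical volumes.
# Usage: grow_lv PARTITION VG_NAME LV_PATH SIZE
grow_lv() {
    for cmd in \
        "pvcreate $1" \
        "vgextend $2 $1" \
        "lvextend -L+$4 $3" \
        "resize2fs $3"
    do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1     # stop at the first failing step
        else
            echo "$cmd"          # dry run: just show the command
        fi
    done
}

# Example dry run: grow_lv /dev/sdc1 VG_NAME /dev/VG_NAME/LV_NAME 5G
```

lvextend also has a -r (--resizefs) option that resizes the file system in the same step, which makes the last command unnecessary.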

Other useful commands when working with disks include:

# lsblk
NAME                             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sr0                               11:0    1  1024M  0 rom  
sda                                8:0    0 501.1M  0 disk 
└─sda1                             8:1    0   500M  0 part /boot
sdb                                8:16   0  29.5G  0 disk 
└─sdb1                             8:17   0  29.5G  0 part 
  ├─vg_name-lv_root (dm-0) 253:0    0  40.6G  0 lvm  /
  └─vg_name-lv_swap (dm-1) 253:1    0   3.7G  0 lvm  [SWAP]
sdc                                8:32   0    20G  0 disk

lsblk lists all block devices.  As shown above, it is an easy way to see disks, their sizes and LVM affiliations.  Of course if you just want the disk device names this will work too:

ls /sys/block | grep sd

Extended ACLs

To permanently remove the ACL from a file:

# setfacl -bn file.txt

To permanently remove the ACLs from an entire directory tree:

# setfacl -R -b directory.name

To overwrite permissions, setting them to rw for files and rwx for dirs:

$ find . \( -type f -exec setfacl -m g:mygroup:rw '{}' ';' \) \
      -o \( -type d -exec setfacl -m g:mygroup:rwx '{}' ';' \)

To set mygroup ACL permissions based on existing group permissions (rwx where the group can already execute, rw otherwise):

$ find . \( -perm -g+x -exec setfacl -m g:mygroup:rwx '{}' ';' \) \
      -o \( -exec setfacl -m g:mygroup:rw '{}' ';' \)

You’ll probably want to check that the ACL mask still allows those permissions to be effective.  If not, you can do it the old-school way and run this too:

$ find . -type d -exec chmod g+rwX '{}' ';'
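
To see whether the mask is getting in the way, look for the "#effective:" comments that getfacl prints next to entries the mask has reduced.  A small sketch; masked_entries is a name I made up.

```shell
# From getfacl output on stdin, print the entries whose rights are reduced by
# the mask (getfacl marks those with an "#effective:" comment).
masked_entries() {
    awk '/#effective:/ { print $1, $NF }'
}

# Example: getfacl directory.name | masked_entries
```

No output means no entry is being limited by the mask.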


Fixing Authentication refused: bad ownership or modes for directory

When this error shows up in /var/log/messages (or /var/log/secure, depending on your syslog configuration):

Authentication refused: bad ownership or modes for directory

while you are trying to set up public key authenticated automatic logins, the problem is a permissions one.

You’ll need to run the following commands as the user account you are trying to set up:

chmod go-w ~/
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
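
To find out which of the three paths is the offender before changing anything, here is a rough check.  Both function names are my own; the test only looks at the group and other write bits, which is what the chmod commands above take away, and it does not check ownership.

```shell
# Succeed only if the given path is not writable by group or others.
# Usage: not_group_other_writable PATH
not_group_other_writable() {
    # In "drwxr-xr-x", characters 6 and 9 are the group and other write bits.
    [ "$(ls -ld "$1" | cut -c6,9)" = "--" ]
}

# Report which of the paths involved in key logins would fail that check.
# Usage: check_ssh_perms HOME_DIR
check_ssh_perms() {
    for p in "$1" "$1/.ssh" "$1/.ssh/authorized_keys"; do
        not_group_other_writable "$p" || echo "too permissive: $p"
    done
}

# Example: check_ssh_perms "$HOME"
```

A missing path is also reported, since ls has nothing to show for it.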

X11 error on login to RedHat Servers

I noticed that since the last set of patches many of my Red Hat 6 systems are reporting an X11 forwarding error after login:

X11 forwarding request failed on channel 0

To correct this problem you need to install the following package:

yum install xorg-x11-xauth

I have not had the time to investigate why this is suddenly a problem.  When I have time I’ll report back the why.