Month: September 2015

Rhel6, Rhel7 Comparison

Moving from Redhat 6 to Redhat 7.  There are a *lot* of differences to get use to.  It is like having a friend come over and rearrange your entire house, including all the closets and cupboards!! You know it is your house, you just can’t seem to find any of your stuff!

Features RHEL 7 RHEL 6
Default File System XFS EXT4
Kernel Version 3.10.x-x kernel 2.6.x-x Kernel
Kernel Code Name Maipo Santiago
General Availability Date of First Major Release 2014-06-09 (Kernel Version 3.10.0-123) 2010-11-09 (Kernel Version 2.6.32-71)
First Process systemd (process ID 1) init (process ID 1)
Runlevel runlevels are called as “targets” as shown below:runlevel0.target -> poweroff.target

runlevel1.target -> rescue.target

runlevel2.target -> multi-user.target

runlevel3.target -> multi-user.target

runlevel4.target -> multi-user.target

runlevel5.target -> graphical.target

runlevel6.target -> reboot.target

/etc/systemd/system/default.target (this by default is linked to the multi-user target)

Traditional runlevels defined :runlevel 0

runlevel 1

runlevel 2

runlevel 3

runlevel 4

runlevel 5

runlevel 6

and the default runlevel would be defined in /etc/inittab file.

/etc/inittab

Host Name Change with the move to systemd, the hostname variable is defined in /etc/hostname. In Red Hat Enterprise Linux 6, the hostname variable was defined in the /etc/sysconfig/network configuration file.
Change In UID Allocation By default any new users created would get UIDs assigned starting from 1000.This could be changed in /etc/login.defs if required. Default UID assigned to users would start from 500.
This could be changed in /etc/login.defs if required.
Max Supported File Size Maximum (individual) file size = 500TBMaximum filesystem size = 500TB(This maximum file size is only on 64-bit machines. Red Hat Enterprise Linux does not support XFS on 32-bit machines.) Maximum (individual) file size = 16TBMaximum filesystem size = 16TB(This maximum file size is based on a 64-bit machine. On a 32-bit machine, the maximum files size is 8TB.)
File System Check “xfs_repair”XFS does not run a file system check at boot time. “e2fsck”File system check would gets executed at boot time.
Differences Between xfs_repair & e2fsck “xfs_repair”- Inode and inode blockmap (addressing) checks.- Inode allocation map checks.

– Inode size checks.

– Directory checks.

– Pathname checks.

– Link count checks.

– Freemap checks.

– Super block checks.

“e2fsck”- Inode, block, and size checks.- Directory structure checks.

– Directory connectivity checks.

– Reference count checks.

– Group summary info checks.

Difference Between xfs_growfs & resize2fs “xfs_growfs”xfs_growfs takes mount point as arguments. “resize2fs”resize2fs takes logical volume name as arguments.
Change In File System Structure /bin, /sbin, /lib, and /lib64 are now nested under /usr. /bin, /sbin, /lib, and /lib64 are usually under /
Boot Loader GRUB 2Supports GPT, additional firmware types, including BIOS, EFI and OpenFirmwar. Ability to boot on various file systems (xfs, ext4, ntfs, hfs+, raid, etc) GRUB 0.97
KDUMP Supports kdump on large memory based systems up to 3 TB Kdump doesn’t work properly with large RAM based systems.
System & Service Manager “Systemd”systemd is compatible with the SysV and Linux Standard Base init scripts it replaces. Upstart
Enable/Start Service the systemctl command replaces service and chkconfig.- Start Service : “systemctl start nfs-server.service”.

– Enable Service : To enable the service (example: nfs service ) to start automatically on boot : “systemctl enable nfs-server.service”.

Although one can still use the service and chkconfig commands to start/stop and enable/disable services, respectively, they

are not 100% compatible with the RHEL 7 systemctl command (according to redhat).

Using “service” command and “chkconfig” commands.- Start Service : “service start nfs” OR “/etc/init.d/nfs start”

– Enable Service : To start with specific runlevel : “chkconfig –level 3 5 nfs on”

Default Firewall “Firewalld (Dynamic Firewall)”The built-in configuration is located under the /usr/lib/firewalld directory. The configuration that you can customize is under the /etc/firewalld directory. It is not possible to use Firewalld and Iptables at the same time. But it is still possible to disable Firewalld and use Iptables as before. Iptables
Network Bonding “Team Driver”-/etc/sysconfig/network-scripts/ifcfg-team0

– DEVICE=”team0”

– DEVICETYPE=”Team”

“Bonding”-/etc/sysconfig/network-scripts/ifcfg-bond0

– DEVICE=”bond0”

Network Time Synchronization Using Chrony suite (faster time sync compared with ntpd) Using ntpd
NFS NFS4.1NFSv2 is no longer supported. Red Hat Enterprise Linux 7 supports NFSv3, NFSv4.0, and NVSv4.1 clients. NFS4
Cluster Resource Manager Pacemaker Rgmanager
Load Balancer Technology Keepalived and HAProxy Piranha
Desktop/GUI Interface GNOME3 and KDE 4.10 GNOME2
Default Database MariaDB is the default implementation of MySQL MySQL
Managing Temporary Files systemd-tmpfiles (more structured, and configurable, method to manage tmp files and directories). Using “tmpwatch”
References :-

Disk Woes

I hope to never use this document again but thought it worth documenting in case someone else has need of the information.  I powered my desktop off for a planned power outage.  When I powered it back on the system failed to boot reporting either “Error 17” or “Error 25”, in short the software raid (mirrored disks) were corrupted…  The timing of this event could not have been better.  The power outage included our data center, so I had to power over 100 systems on without my desktop!  Thank God for Live CDs!!  Following the power on there were other issues to deal with so it was almost a week before I could deal with my failed desktop.  Here is what I tried:

“sata to USB cable” since the drive was part of a raid pair this didn’t work and I didn’t waste a lot of time on it.  What it did help me discover was which disk was bad.

Knowing which disk was bad I then confirmed the failed drive using the BIOS and boot sequence on my desktop.  I confirmed it was /dev/sda that was failed.  I was able to get a replacement disk on the same size from our desktop support team.  With the new disk installed here is what I did and the results.

Boot the system to an Ubuntu Live CD

I don’t have time to add much description now but the commands and sequence should hopefully help for now.  Feel free to post a question in the comments if you have any.

sudo mdadm --query --detail /dev/md/1
sudo mdadm --assemble --scan
sudo mdadm --query --detail /dev/md/1
sudo mdadm --assemble 
sudo mdadm --assemble --scan

sudo mdadm --query --detail /dev/md/1
sudo mdadm --query --detail /dev/md/0
sudo mdadm --query --detail /dev/md/2
sudo mdadm --query --detail /dev/md/3

sudo mdadm --stop /dev/md/0
sudo mdadm --stop /dev/md/1
sudo mdadm --stop /dev/md/2
sudo mdadm --stop /dev/md/3

sudo mdadm --query --detail /dev/md/0
sudo mdadm --query --detail /dev/md/1
sudo mdadm --query --detail /dev/md/2
sudo mdadm --stop /dev/md/2
sudo mdadm --query --detail /dev/md/3
sudo mdadm --stop /dev/md/3

sudo fdisk -l

cat /proc/mdstat 
sudo mdadm --assemble --scan

cat /proc/mdstat 
sudo mount /dev/md3 /mnt
cat /proc/mdstat 
sudo mount /dev/sdb1 /mnt

sudo fdisk -l

sudo mdadm stop /dev/md/0n3

cat /proc/mdstat
sudo mdadm --manage /dev/md0 --fail /dev/sda1
sudo mdadm --manage /dev/md0 --fail /dev/sda
sudo mdadm --manage /dev/md1 --fail /dev/sda2
sudo mdadm --manage /dev/md2 --fail /dev/sda3
cat /proc/mdstat
sudo sfdisk -d /dev/sda > sda.out
sudo sfdisk -d /dev/sdb |sudo sfdisk /dev/sda
sudo sfdisk -d /dev/sda > sda.out

sudo fdisk -l
sudo mdadm --manage /dev/md0 --add /dev/sda1
sudo mdadm --manage /dev/md1 --add /dev/sda2
sudo mdadm --manage /dev/md2 --add /dev/sda3
sudo mdadm --manage /dev/md3 --add /dev/sda5
cat /proc/mdstat 
watch cat /proc/mdstat 

Every 2.0s: cat /proc/mdstat                                                                                                                                                                 Mon Aug 17 13:15:31 2015

Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      4093888 blocks super 1.1 [2/2] [UU]

md1 : active raid1 sda2[2] sdb2[1]
      819136 blocks super 1.0 [2/2] [UU]

md3 : active raid1 sda5[2] sdb5[1]
      278538048 blocks super 1.1 [2/1] [_U]
      [==============>......]  recovery = 70.4% (196127360/278538048) finish=15.0min speed=91334K/sec
      bitmap: 0/3 pages [0KB], 65536KB chunk

md2 : active raid1 sda3[2] sdb3[1]
      204668800 blocks super 1.1 [2/2] [UU]
      bitmap: 0/2 pages [0KB], 65536KB chunk

unused devices: <none>

Good Luck

 

 

 

 

Unresponsive VMware Images

Over the past week I have had two vmware images become unresponsive.  When trying to access the images via the vmware console any action reports:

rejecting I/O to offline device

A reboot fixes the problem, however for a Linux guy that isn’t exactly acceptable.  Upon digging a little deeper it appears the problem is with disk latency or more specifically a disk communication loss or time out with the SAN.  I looked at the problem with the vmware admin and we did see a latency issue.  We reported that to the storage team.  That however does not fix my problem.  What to do…  The real problem is that systems do not like I/O temporary loss of communication with their disks.  This tends to result in a kernel panic or in this case never ending I/O errors.

Since this is really a problem of latency (or traffic) there are a couple of things that can be done on the Linux system to reduce the chances of this happening while the underlying problem is addressed.

There are two things you can address, swappiness (freeing memory by writing runtime memory to disk aka swap).  The default setting it 60 out of 100, this generates a lot of I/O.  Setting swappiness to 10 works well:

vi /etc/sysctl.conf
vm.swappiness = 10

Unfortunately for me, my systems already have this setting (but I verified it) so that isn’t my culprit.

The only other setting I could think of tweaking was the disk timeout threshold.  If you check your systems timeout it is probably set to the default of 30:

cat /sys/block/sda/device/timeout
30

Increasing this value to 180 will hopefully be sufficient to help me avoid problems in the future.  You do that by adding an entry to /etc/rc.local:

vi /etc/rc.local
echo 180 > /sys/block/sda/device/timeout

I’ll see how things go and report back if I experience any more problems with I/O.

 UPDATE (24 Sep 2015):

The above setting while good to have did not resolve he issue.  Fortunately I was logged into a system when it began having the I/O errors and I was still able to perform some admin functions.  Poking around the system and digging in the system logs dmesgs at the same time led me to this vmware knowledge base article about linux 2.6 systems and disk timeouts:

I passed this on to our vmware team.  They dug deeper and determined that installing vmware tools would accomplish the same thing.  I installed vmware tools on the server and the problem went away!  It seems vmware tools hides certain disk events that linux servers are susceptible to.  There you go, hope that helps.