Sunday, December 21, 2008

Upgrading Server to Ubuntu Hardy Heron LTS from Dapper Drake LTS

My company hosts several rails applications. For the ones in high demand - we use mongrel_cluster with nginx. The only problem is ... we use apache for everything else. So we proxy pass requests into nginx from apache. That seemed so redundant that I decided to get rid of nginx and use mod_proxy_balancer instead.

On 6.06 this turned out to be much harder than it seemed. Essentially did not exist in /usr/lib/apache2/modules .. I would have to compile it with apxs to get it into the installation. I found out that apache 2.2 came with proxy_balancer but when I tried to update the apache package ubuntu said it was already the newest version. I knew this meant I may have to consider an upgrade to the next LTS. Beyond using mod_proxy_balancer I had been trying to get "Phusion Passenger" to work for over a month. (I had to become very familiar with httpd.h and mod_passenger.c to get it to even compile). As of that point I still had no way of serving up rails applications from apache without using Proxy Pass.

It was late on Saturday night and I had the whole weekend to fix anything that broke so I felt pretty confident that everything should be fine.

I did the commands.

#sudo su
#aptitude update
#aptitude upgrade
#aptitude dist-upgrade
#aptitude install update-manager-core

The upgrade was to be 287mb and take several hours. I pressed the "y" key and started browsing reddit on my laptop.
Through the installation I was asked what to do about configuration file conflicts between packages and my own custom versions. There were many times where I honestly didn't care because I didn't even know certain things were still installed. ldap.conf? hylafax.conf? I mean I played around with them .. thought I uninstalled those things. There were several obvious cases where I just kept my existing configs (my.cnf, apache2.cnf, php.ini etc)

The upgrade completed with an error message about /etc/fstab.pre-uuid already existing. I disregarded the error after googling the message for 10 minutes and finding nothing. Everything seemed fine.

I was delighted to finally get phusion passenger working and mod_balancer active. I took the liberty of installing about 10-15 packages I had experimented with but had no further use for. hylafax, bugzilla, otrs, auth-ldap-client etc... then I went home

The fallout

Later that night I went to show off some of the performance benchmarks to a friend and caught a page hanging. I pulled up my ssh terminal and tried to get in to see what was going on. I Couldn't get in! ! .

The next day I went on site to get on the server directly and see if I could get in. I entered every login and password I knew and it wouldn't even accept my username!. I followed instructions for manually resetting the passwords by going into recovery mode. I restarted the machine... none of the logins were checking out. I restarted again and looked at auth.log

Dec 21 06:36:55 www nscd: nss_ldap: reconnecting to LDAP server (sleeping 1 seconds)...
Dec 21 06:36:56 www nscd: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Dec 21 06:36:56 www nscd: nss_ldap: failed to bind to LDAP server ldap:// Can't contact LDAP server
Dec 21 06:36:56 www nscd: nss_ldap: could not search LDAP server - Server is unavailable
Dec 21 06:37:01 www CRON[9390]: PAM unable to dlopen(/lib/security/
Dec 21 06:37:01 www CRON[9390]: PAM [error: /lib/security/ cannot open shared object file: No such file or directory]
Dec 21 06:37:01 www CRON[9390]: PAM adding faulty module: /lib/security/
Dec 21 06:37:01 www CRON[9390]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 21 06:37:01 www CRON[9392]: PAM unable to dlopen(/lib/security/
Dec 21 06:37:01 www CRON[9392]: PAM [error: /lib/security/ cannot open shared object file: No such file or directory]

It hit me like a ton of bricks. At one point we had another IT guy here who wanted to use ActiveDirectory to manage the users. I hated windows and microsoft for a variety of reasons and wanted to prove to him that I could provide a much easier to use system using linux and phpldapadmin. I installed LDAP ... integrated it into the system and got it running - and we never used it. Now I've removed auth-ldap-client and the authentication client depends on ldap to check if the user is in ldap.

I looked at /etc/pam.d/ and /etc/nsswitch.conf .. where I found references to ldap in /etc .. I also found them in /etc/auth-client-config .. I read up on auth-client-config and found out that it can be used to control nsswitch and pam.d/* config files with profiles. I couldn't find a pre-ldap example so i modified the kerberos example and executed auth-client-config -a -p kerberos_example from the recovery prompt. And everything worked fine after that.

So please.. If you hear about a package, a project or the next biggest thing and you must install something on your machine. Consider doing it in a sandbox VM

No comments: