Thursday, December 2, 2010

vchanger, Bacula, and "ERR=Unknown error during program execvp"

 After upgrading Bacula to 5.0.3, I started getting an error message when attempting to load a tape:


ERR=Unknown error during program execvp


I suspected a problem with vchanger, as it couldn't load or unload tapes either. Looking at the disk that vchanger happened to be using, I discovered that drive0 clearly had a tape in it, but the file loaded0 was empty. I determined which "tape slot" was empty, wrote the name of that slot into the file loaded0, and unloaded the drive. Everything worked after that.


Wednesday, December 1, 2010

Problems Building RPMs for Bacula 5.0.3

So, after starting to build the RPMs from source for Bacula 5.0.3, I discovered several errors.


1. File /home/user1/rpm/SOURCES/bacula-5.0.2.tar.gz: No such file or directory

 Easily fixed, but I ran into additional errors:

2. error: Installed (but unpackaged) file(s) found:
   /usr/lib/bacula/btraceback.mdb

 Also easily fixed... Alas! More problems:

3. RPM build errors:
    Bad exit status from /home/user/rpm/tmp/rpm-tmp.26291 (%doc)
    File listed twice: /usr/lib/libbaccfg-5.0.3.so
    File listed twice: /usr/lib/libbaccfg.la
    File listed twice: /usr/lib/libbaccfg.so
    File listed twice: /usr/lib/libbacfind-5.0.3.so
    File listed twice: /usr/lib/libbacfind.la
    File listed twice: /usr/lib/libbacfind.so
    File listed twice: /usr/lib/libbacpy-5.0.3.so
    File listed twice: /usr/lib/libbacpy.la
    File listed twice: /usr/lib/libbacpy.so

Around line 1422 or so in the bacula.spec file, you'll see:

%files libs
%defattr(-,root,root)
%{_libdir}/libbac*
%{_libdir}/libbaccfg*
%{_libdir}/libbacfind*
%{_libdir}/libbacpy*


It appears that we don't need all those wildcards... the first wildcard encompasses the remainder - so I removed every line except for the line ending in /libbac*.

4. One final error: the spec file expects a Release_Notes-5.0.3-1.tar.gz, containing the file Release_Notes-5.0.3-1.txt, in the SOURCES directory. After adding that, the Bacula RPMs built correctly.
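
To see why rpm complained about files being listed twice in item 3, here is a minimal Python sketch of the overlap. It uses Python's fnmatch as a stand-in for rpm's own glob handling (an assumption; rpm's matching may differ in details), with %{_libdir} expanded to /usr/lib as in the error output. The function name listed_twice is made up for illustration.

```python
from fnmatch import fnmatchcase

# Files copied from the "File listed twice" errors in the build output.
FILES = [
    "/usr/lib/libbaccfg-5.0.3.so",
    "/usr/lib/libbaccfg.la",
    "/usr/lib/libbaccfg.so",
    "/usr/lib/libbacfind-5.0.3.so",
    "/usr/lib/libbacfind.la",
    "/usr/lib/libbacfind.so",
    "/usr/lib/libbacpy-5.0.3.so",
    "/usr/lib/libbacpy.la",
    "/usr/lib/libbacpy.so",
]

# The four entries from the "%files libs" section, with %{_libdir} expanded.
ORIGINAL_PATTERNS = [
    "/usr/lib/libbac*",
    "/usr/lib/libbaccfg*",
    "/usr/lib/libbacfind*",
    "/usr/lib/libbacpy*",
]

# What is left after removing the three narrower wildcards.
FIXED_PATTERNS = ["/usr/lib/libbac*"]


def listed_twice(patterns, files):
    """Return the files that are matched by more than one pattern."""
    return [f for f in files
            if sum(fnmatchcase(f, p) for p in patterns) > 1]


if __name__ == "__main__":
    print(len(listed_twice(ORIGINAL_PATTERNS, FILES)), "files matched twice before the fix")
    print(len(listed_twice(FIXED_PATTERNS, FILES)), "files matched twice after the fix")
```

With the original four patterns, every file in the list is caught once by libbac* and once by its narrower wildcard; with only libbac* left, nothing is matched twice.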

Here's the diff of the bacula.spec file:

--- bacula.spec 2010-12-01 10:47:37.000000000 -0500
+++ rpm/SPECS/bacula.spec       2010-12-01 10:57:20.000000000 -0500
@@ -6,7 +6,7 @@

 # basic defines for every build
 %define _release           1
-%define _version           5.0.2
+%define _version           5.0.3
 %define _packager D. Scott Barninger
 %define depkgs_version 18Dec09

@@ -1356,6 +1356,7 @@

 %{_sbindir}/bacula-fd
 %{_sbindir}/btraceback
+%attr(-, root, %{daemon_group}) %{script_dir}/btraceback.mdb
 %attr(-, root, %{daemon_group}) %{script_dir}/btraceback.gdb
 %attr(-, root, %{daemon_group}) %{script_dir}/btraceback.dbx
 %{_sbindir}/bconsole
@@ -1422,9 +1423,6 @@
 %files libs
 %defattr(-,root,root)
 %{_libdir}/libbac*
-%{_libdir}/libbaccfg*
-%{_libdir}/libbacfind*
-%{_libdir}/libbacpy*

 %post libs
 /sbin/ldconfig











Monday, November 15, 2010

Bacula - Excluding Certain Subdirectories Using Regular Expressions

I was asked by a former colleague how to do the following in Bacula:

Back up all the subdirectories in /data/users/, but exclude /data/users/*/scratch.

Unfortunately, Bacula is slightly less than intuitive on this matter. Simply adding the directory and adding an exclusion afterwards won't work correctly. You need to list the exclusion first, and then add the directory.

Here is the pattern that works (in the FileSet for the given host - you can ignore the large list of options I have in the set; the first block of options is not necessary for the selective exclusion):

FileSet {
  Name = "file_server set"
  Include {
    Options {
      signature = SHA1
      compression=GZIP5
      onefs=yes
      noatime=yes
      hardlinks=yes
      sparse=yes
      ignore case=no
      checkfilechanges=yes
    }

    /usr
    /etc
    /var

    Options {
      RegexDir = "^/data/users/.*/scratch"
      exclude = yes
    }
    /data/users
  }
}


As you can see, the exclusion needs to happen before the directory is selected. It appears to be a "first match wins" selection method.

Of course, you should test this yourself to make sure it works and doesn't negatively affect your backups.
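
Before testing against a real backup, it can help to check which paths the expression itself would catch. This is only a rough sketch: it uses Python's re module rather than Bacula's own regex engine (so treat the results as an approximation), and the sample paths and the function name is_excluded are invented for illustration.

```python
import re

# The same expression used in the RegexDir line of the FileSet.
SCRATCH_DIR = re.compile(r"^/data/users/.*/scratch")


def is_excluded(directory):
    """Return True if the directory path matches the exclusion pattern."""
    return SCRATCH_DIR.search(directory) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from a real server.
    for path in [
        "/data/users/alice/scratch",
        "/data/users/alice/docs",
        "/data/users/scratch",
        "/data/users/alice/scratchpad",
    ]:
        print(path, "excluded" if is_excluded(path) else "kept")
```

Note that, at least under Python's matching rules, the pattern is not anchored at the end, so a directory such as /data/users/alice/scratchpad would match as well. Whether that matters depends on how your directories are named.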

Sunday, October 31, 2010

FreeNAC - VLAN access management on Cisco Switches

I started playing around with FreeNAC recently in conjunction with several Cisco 3500XL switches, using the virtual machine they provide. The FreeNAC source on it needs to be upgraded to the latest code in order for the Windows GUI to work (there are schema errors otherwise).
I was interrupted in the testing. I'll post more when I have results... most likely tips.

Thursday, September 30, 2010

CentOS 5.5, Samba/Winbind, Windows 2008R2 Active Directory

In order to facilitate client backups, I set up samba on CentOS on a Windows 2008R2 based domain. I created an empty directory, /etc/skel2, as I was not planning on letting users log in via ssh or the console.

 Unfortunately, the samba 3.0.x line wouldn't work correctly. I could join the domain, but not connect to shares. Samba logged the following message on every connection attempt:


  read_data: read failure for 4 bytes to client 192.168.70.23. Error = Connection reset by peer

I upgraded to the Samba3 package with yum (Samba 3.3.x) and rejoined the domain. I added an entry to the pam.d config for samba:

 session required pam_mkhomedir.so skel=/etc/skel2 umask=0077


(I don't want users to see each other's directories. I did not modify system-auth, as I did not want the users to log in with anything but samba.)

And added this to smb.conf:


  winbind separator = \
  # use uids from 10000 to 20000 for domain users
  idmap uid = 10000-20000
  # use gids from 10000 to 20000 for domain groups
  idmap gid = 10000-20000
  # allow enumeration of winbind users and groups
  winbind enum users = yes
  winbind enum groups = yes
  winbind use default domain = yes
  template homedir = /data/clientdata/backups/%D/%U
  # give winbind users a real shell (only needed if they have telnet access)
  template shell = /bin/bash

  obey pam restrictions = yes

An important note: the samba3 rpm separates the smbd and nmbd init scripts, so you'll need to run a separate "chkconfig nmb on".

Friday, August 27, 2010

Upgrading to ESXi 4.1 from 4.0.

I recently needed to update an ESXi 4.0 host to ESXi 4.1. After discovering that the host update utility was not supported with ESXi 4.1, I resorted to my other option, the vSphere CLI.
Here's the session (on Windows 7 64bit):

 C:\Program Files (x86)\VMware\VMware vSphere CLI>bin\vihostupdate.pl --server my_server -b upgrade-from-ESXi4.0-to-4.1.0-0.0.260247-release.zip -i

The output:


 Please wait patch installation is in progress ...
 The update completed successfully, but the system needs to be rebooted for the changes to be effective.


Followed by a reboot:


 C:\Program Files (x86)\VMware\VMware vSphere CLI> bin\vicfg-hostops.pl --server my_server -o reboot

 Host my_server rebooted successfully.

Wednesday, August 11, 2010

Netgear GSM Switches and LAGs/Port channels

Setting up port-channels/LAGs is pretty easy on Netgear GSM switches. In this example, we assume that we are connecting two Netgear GSM7324 switches and adding VLAN 10 to them. We'll link together ports 23 and 24 on both switches.

1. Create the interface:

configure
port-channel lag_01

2. Assign it to two ports:
configure
interface range 0/23-0/24
addport 1

3. Allow VLAN 10 and 1 to travel over it. Require them both to be tagged:
configure
interface lag 1
vlan participation include 1
vlan tagging 1
vlan participation include 10
vlan tagging 10

Friday, July 30, 2010

New Console Game Out...

This is a fake game cover I made for a friend. Not computer related, but since no one reads this blog anyway, who cares?





Wednesday, June 23, 2010

Trouble sending out email from Bacula 5.0.2 and CentOS 5.5

I recently noticed that Bacula installations on CentOS 5.x (primarily 5.5) boxes were not sending out email after job completion. It was baffling as bsmtp worked fine on the command line... at least when I ran it as root. I tried running it as the bacula user (my Bacula director runs as the Bacula user, not root) and it failed:

$ su - bacula -s "/usr/sbin/bsmtp"
-bsmtp: error while loading shared libraries: libbac-5.0.1.so: cannot open shared object file: Permission denied


Ah... I checked the file permissions:

ls -la /usr/lib/libbac-5.0.1.so
-rwxr-x--- 1 root root 330240 Jun 22 12:46 /usr/lib/libbac-5.0.1.so

Simple fix:

chmod o+r /usr/lib/libbac-5.0.1.so

On 64-bit CentOS/RHEL, it will be /usr/lib64/libbac-5.0.1.so

I'm guessing that there was a problem in the spec file (I built it from an srpm.)
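
If you want to check for the same problem from a script, here is a small Python sketch of the test and the fix. It only looks at the "other" read bit, which is what the chmod o+r above changes. The function names are made up, and the library path in the example is the one from my box, so adjust it for yours.

```python
import os
import stat
import sys


def readable_by_others(path):
    """Return True if the 'other' read bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def add_other_read(path):
    """Equivalent of 'chmod o+r': set the 'other' read bit, keep the rest."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IROTH)


if __name__ == "__main__":
    # Default to the library from this post; pass another path as an argument.
    lib = sys.argv[1] if len(sys.argv) > 1 else "/usr/lib/libbac-5.0.1.so"
    if readable_by_others(lib):
        print(lib, "is already readable by others")
    else:
        print(lib, "is NOT readable by others; 'chmod o+r' would fix it")
```

The script only reports by default; call add_other_read() yourself (as root) if you want it to make the change.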

Sunday, May 23, 2010

Bacula Windows Client and error 1067

I recently ran into an error starting the bacula-fd service on a Windows client: error 1067. I looked it up... several people had this error, but no resolution. I looked at the application log on the Windows client, but there was no Bacula message.

To figure it out, I started the bacula-fd.exe file by hand from a command prompt. It gave the full error message (I had a minor typo in the bacula-fd.conf file.)
I corrected the typo and was able to successfully start the service.

Incidentally, a useful one-liner for bconsole on Linux:

watch -n 1 --differences "echo status 2 2 | bconsole | \
    grep 'Files=' | sed -e 's/  */ /g' | cut -f 3 -d \" \""

In this case, my storage device is #2. It produces output like this (every second):
Every 1.0s: echo status 2 2 | bconsole | grep 'Files=' | sed -e 's/  */ /g' | cut -f 3 -d " " Sun May 23 00:45:54 2010

Bytes=1,765,498,476



Unfortunately, this won't work on FreeBSD as the watch command from ports has different syntax.
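
For systems without a compatible watch, the text processing part of the one-liner is easy to reproduce. The sketch below does what the grep, sed, and cut stages do: pick the lines containing Files=, squeeze runs of spaces into one, and take the third space-separated field. The sample status line is invented to resemble the storage daemon output (an assumption; check it against your own bconsole output), and the function names are hypothetical.

```python
import re


def third_field(line):
    """Collapse runs of spaces, then return the third space-separated field."""
    fields = re.sub(r" +", " ", line).split(" ")
    return fields[2] if len(fields) > 2 else ""


def bytes_written(status_text):
    """Return the third field of every line that contains 'Files='."""
    return [third_field(line)
            for line in status_text.splitlines()
            if "Files=" in line]


if __name__ == "__main__":
    # Made-up sample; real output comes from: echo status 2 2 | bconsole
    sample = "Device status:\n    Files=12 Bytes=1,765,498,476 Bytes/sec=1,234\n"
    for value in bytes_written(sample):
        print(value)
```

The leading spaces on the status line are why the byte count is the third field and not the second: after squeezing, the line still starts with one space, so the first field is empty.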

Thursday, April 29, 2010

Removing the Request Tracker (RT) Subject Prefix

I have an installation of Request Tracker that I use at work to manage helpdesk tickets, assets, projects, and other things. I recently added a support queue for the customer services people. They're using another program to actually manage the cases, but want to use RT for the auto responder.

That's not a big deal, of course. The problem was the RT subject header. You can modify the template subject as you wish, but you can never get rid of the RT name prefix:

[So and So's support queue #233] AutoReply: I need help on this problem

Since the support people are using another program to manage the cases, they didn't want to confuse users with an RT ticket number. Once the user sends in his or her email and the support team creates a case in the other package, the ticket will be closed. The requestor will receive only the autoreply from RT, and the rest of the correspondence will be through the other system.

I could not figure out a way to leave off the prefix without affecting all the queues. I searched through the code and found the least invasive hack:


In the Interface::Email::SendEmail method... here's a diff:
diff -ru Email.pm /usr/local/lib/perl5/site_perl/5.8.9/RT/Interface/Email.pm
--- Email.pm 2009-10-19 15:55:31.000000000 -0400
+++ /usr/local/lib/perl5/site_perl/5.8.9/RT/Interface/Email.pm 2010-04-29 17:34:52.000000000 -0400
@@ -308,6 +308,7 @@
=cut

sub SendEmail {
+
my (%args) = (
Entity => undef,
Bounce => 0,
@@ -315,6 +316,7 @@
Transaction => undef,
@_,
);
+
foreach my $arg( qw(Entity Bounce) ) {
next unless defined $args{ lc $arg };

@@ -329,6 +331,13 @@

my $msgid = $args{'Entity'}->head->get('Message-ID') || '';
chomp $msgid;
+#### 4.29.2010 - custom hack to skip ticket number on support queue
+my $tSub = $args{'Entity'}->head->get('Subject');
+if ($tSub =~ m/^\[MyCompany's Customer Support queue #[0-9]+\] (AutoReply:.*)$/) {
+ $args{'Entity'}->head->set('Subject' => "[MyCompany's Customer Support queue] $1");
+}
+#### end custom hack
+

# If we don't have any recipients to send to, don't send a message;
unless ( $args{'Entity'}->head->get('To')


Notice that it only affects messages sent out as an AutoReply (i.e., by the Scrip that responds to ticket creation). I could have been more sophisticated and added a custom field to the queues, and modified the actual Autoreply routine to look up the value... but I consider this to be a one-off, and this was the simplest modification. I suppose I didn't absolutely need a temp variable, but I put one in anyway.
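
To try the subject rewrite without touching RT, here is the same match-and-replace as a Python sketch. The regular expression is copied from the hack above; the function name strip_ticket_number and the sample subjects are only for illustration, and Python's re is standing in for Perl's regex engine.

```python
import re

# Same expression as in the Email.pm hack: queue name, ticket number,
# then the AutoReply text captured in group 1.
SUBJECT = re.compile(
    r"^\[MyCompany's Customer Support queue #[0-9]+\] (AutoReply:.*)$"
)


def strip_ticket_number(subject):
    """Drop the '#NNN' ticket number from AutoReply subjects on this queue."""
    match = SUBJECT.match(subject)
    if match:
        return "[MyCompany's Customer Support queue] " + match.group(1)
    # Anything else (other queues, correspondence, etc.) is left alone.
    return subject


if __name__ == "__main__":
    print(strip_ticket_number(
        "[MyCompany's Customer Support queue #233] AutoReply: I need help"))
    print(strip_ticket_number("[Helpdesk #17] AutoReply: printer is down"))
```

The match is case sensitive, so a template whose subject starts with "Autoreply:" instead of "AutoReply:" would slip through unchanged.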

Monday, April 26, 2010

OpenBSD, Dual Gateways, and redirects

I wanted to allow users to ssh to an sftp server, but using the secondary ISP connection. The SFTP server is in a DMZ (actually, in this case, it's a VLAN off an internal NIC.) This is with OpenBSD version 4.4.

#As usual, we need to set up the pf.conf file so that NATing happens on both interfaces:

nat on $ext_if from !($ext_if) to ! -> ($ext_if:0)
nat on $ext_if2 from !($ext_if2) to ! -> ($ext_if2:0)

# here's the actual redirect
rdr pass on $ext_if2 proto tcp from any to ($ext_if2:0) port 40000 \
-> $sftp port 22

# I haven't tried this with 4.6 or later... anyway, keep state appears
# to break things, as later packets go out the primary ISP connection
# ($ext_if, not $ext_if2)
pass out on $dmz3_if proto tcp from any to $sftp port 22 no state
pass in on $dmz3_if route-to ($ext_if2 $gateway2) proto tcp from \
$sftp port 22 to any no state



It works for me.

Saturday, March 20, 2010

Migrating NT 4 to Windows 2000, Sun VirtualBox, and VMware

I just performed my last Windows NT to Windows 2000 domain migration (I certainly hope it's the last time!) I was tasked with moving an NT 4 domain with a single domain controller to Active Directory. Fortunately, the company did not have Exchange (I've done NT 4 + Exchange 5.5 to Win2k3 + Exchange 2003, and it's not a fun migration.)

Here are the steps I followed, more or less:

1. Obtained Windows NT Server media, Windows 2000 server media, and Windows 2003 R2 32bit media

2. I migrated the existing NT 4 server to a VMware Server 1 VM (the company only had a single VMware server running on CentOS 5.2). I used VMware Converter version 3.x (version 4 does not support NT 4).

2a. I had some time on my hands, so I simulated the whole thing with a copy of the VM I created in step 3. I did the entire migration on virtual machines before continuing on site.

2b. I created a snapshot of the virtual NT 4 PDC.

2c. I set up an additional DNS server on another box (a windows 2003 server) and made it the primary DNS server. It is possible to use Bind 9, but the client did not have enough unix infrastructure to make that practical.

3. I created a Win NT BDC as a VM on Sun VirtualBox 3.x. I created a single disk of less than 4GB (I made it 3.8GB.) The disk did not need to be huge. I was not intending for this VM to be around long term. This worked fairly well. I originally tried to create it using VMware Workstation 7.x, but had several annoying problems, so I just used VirtualBox. The NIC on the VM was bridged. I gave it a static IP and put it in DNS.

4. I promoted the VM I created in step 3 to be the PDC (using the server manager on the original NT 4 PDC).

5. I created a snapshot of the new NT 4 PDC.

6. I upgraded the new PDC to Windows 2000. I ran into some problems with the VM crashing (VirtualBox crashing, to be precise.) I stopped the crashing by disabling all virtualization extensions on the VirtualBox config for the VM. At any rate, this created an Active directory installation. I pointed DNS to the win2003 server I mentioned in step 2c.

8. After the upgrade and Active directory creation, I made the new win2000 box the primary dns server for the zone. Once I had transferred the zone successfully, I changed the zone to be active directory integrated.

9. I applied Windows 2000 Service Pack 4 and all relevant updates (I had to install IE 6 SP1 to get Windows Update to work correctly.)

10. I ran adprep /forestprep on the new windows 2000 domain controller. Adprep was located on the second disc of Windows 2003 R2 32bit in the directory CMPNENTS\R2\ADPREP.

11. I ran adprep /domainprep and adprep /domainprep /gpprep on the Windows 2000 domain.

At this point, I had a Windows 2000 domain. I could safely shut down and delete the NT 4 BDC. I left it around, as there were a few files to grab.

12. I then ran dcpromo on the Windows 2003 box I mentioned earlier, and made it an additional DC.

13. I made the windows 2003 R2 dc a global catalog server.

I haven't had time to go back and finish up (add another Windows 2003 R2 box (or even Windows 2008 R2)) yet. I verified that filesharing and login worked, of course.

Thursday, March 11, 2010

Remote Desktop Clients, 32Bit Applications, and Windows 2008 R2

I recently ran into a problem with Remote Desktop Services on Windows 2008 R2 (formerly known as Terminal Services from Windows NT 4 through Windows 2008). The problem occurs when someone configures an RDP session that automatically launches a 32 bit application on connection. As you may know, Windows 2008 R2 is 64bit only. The problem I was seeing was that there was frequently a delay after exiting the application.

What should happen is that the RD session should drop after the application closes. What often happens is that the user sees a blue screen for up to two or three minutes, then the session completely drops.



I figured out the cause by using the Remote Desktop Services Manager. I launched a client session that started the 32-bit app upon login. I then closed the application, and the session hung. Looking in the RD Services Manager, I noticed that there was a process in the session called "splwow64.exe". I killed that process, and the session dropped.



The Fix



The fix is pretty simple. It requires a registry edit. Perform this edit at your own risk! You should back up the registry before you start.

Navigate to the registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\SysProcs

Add the following 32 bit DWORD:
splwow64.exe

and set the value to 0

The fix worked instantly, at least for me. Your results may vary, of course. As far as I understand, you can create other dword values in there to terminate other programs that are stalling the exit of remote desktop sessions.

Apparently, splwow64.exe is used in thunking (in this context, converting memory addresses from 32 bit to 64 bit.)
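
If you need to apply the same change to more than one server, one option is to generate a .reg file and import it, instead of clicking through regedit each time. The Python sketch below just builds the file text for the key and value described above; the function name dword_reg_text is made up, and I have not tested importing the result on anything other than the setup described here, so treat it as a starting point.

```python
SYSPROCS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Control\Terminal Server\SysProcs"
)


def dword_reg_text(key_path, value_name, value):
    """Build the text of a .reg file that sets one 32-bit DWORD value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + key_path + "]",
        '"%s"=dword:%08x' % (value_name, value),
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Prints the file contents; redirect to e.g. splwow64.reg and review
    # it before importing. Back up the registry first.
    print(dword_reg_text(SYSPROCS_KEY, "splwow64.exe", 0))
```

The same function can produce entries for other processes that hold a session open, by passing a different value name.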

Friday, March 5, 2010

Bacula 5.0

Bacula 5.0 has been out for a little while. Apparently, the developers decided to skip 4.x to differentiate the community edition from the enterprise edition. The new features list is here. I'll be setting it up soon and will likely post some sample configurations.

Friday, February 26, 2010

Linksys Managed Switches - Console Port

I had to set up a Linksys SRW2048 switch recently. It's a 48 port gigabit layer 2 switch. It's primarily web managed, but there is a limited CLI.

Anyway, the console port did not appear to be a standard pinout. I opened the RJ45 to DB9 connector they provided, and discovered it has this pinout:

Pin 1   x
Pin 2   black
Pin 3   yellow
Pin 4   brown
Pin 5   red (+green)
Pin 6   orange
Pin 7   white
Pin 8   blue
Pin 9   x


This is the same pinout I used for the type "A" RJ45-DB9 connector in this article: here.

From my testing, you simply need to make two of the type "A" adapters I mentioned in the linked post, attach a cat5 cable to both ends, and you should be able to connect to the serial port. The Linksys uses 38400 baud - unlike the usual 9600.

Tuesday, January 5, 2010

Nagios, Passive Service and Host Checks

I recently had to set up a remote nagios installation that would connect back to a central server via NSCA. I followed the standard Nagios documentation for distributed monitoring here. The passive service checks worked fine.


The problem was that I wanted passive checks for hosts, too, and not just services, as the central server would not be able to ping the remote hosts directly in the event that the data is considered stale. I set up all the checks on the central server side to be the service-is-stale check - so any staleness results in an alert, "the check data is stale." Unfortunately, the Nagios documentation is a little vague about the ochp command (the final paragraph on the Nagios link above.) I couldn't find any real answer on passive host checks on the web, either.




Here's what I did on the remote side (and not the central collector server.) I copied the submit_check_result script in the documentation, modified it, and saved it as /etc/nagios/bin/submit_host_result. The final version:

#!/bin/sh
# Arguments:
# $1 = host_name (short name of the host being reported)
# $2 = host state ID (0, 1, 2, etc.)
# $3 = plugin_output (a text string that should be used
# as the plugin output for the host check)
#
# Address of the central NSCA server
central_server=ip.address.or.hostname.of.your.central.nsca.server

/usr/bin/printf "%s\t%s\t%s\n" "$1" "$2" "$3" | /usr/sbin/send_nsca -H "$central_server" -c /etc/nagios/send_nsca.cfg
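
The only real difference from the service version of the script is the line that gets piped to send_nsca: a host result has three tab-separated fields (host name, state ID, output) and no service description. Here is a small Python sketch of that line format, in case you want to generate results from something other than a shell script. The function name and the sample host are made up; the format mirrors the printf above.

```python
def format_host_result(host_name, state_id, plugin_output):
    """Build one passive host check line in the form piped to send_nsca."""
    # Tabs separate the fields and a newline ends the record, so neither
    # may appear inside a field.
    for field in (host_name, plugin_output):
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "%s\t%d\t%s\n" % (host_name, int(state_id), plugin_output)


if __name__ == "__main__":
    # Hypothetical host and output, just to show the shape of the line.
    line = format_host_result("remote-web01", 1, "CRITICAL - Host Unreachable")
    print(repr(line))
```

The result would then be written to the standard input of send_nsca, exactly as the shell script does with its pipe.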


I then added the following entry to the command definition file:

define command{
    command_name submit_host_result
    command_line /etc/nagios/bin/submit_host_result $HOSTNAME$ $HOSTSTATEID$ '$HOSTOUTPUT$'
}

I then modified the nagios.cfg file like so:

obsess_over_hosts=1
# OBSESSIVE COMPULSIVE HOST PROCESSOR COMMAND
# This is the command that is run for every host check that is
# processed by Nagios. This command is executed only if the
# obsess_over_hosts option (above) is set to 1. The command
# argument is the short name of a command definition that you
# define in your host configuration file. Read the HTML docs for
# more information on implementing distributed monitoring.
ochp_command=submit_host_result

Of course, it took a bit of work to figure that out. So, the end result is that both service and host checks are passive on the central server. You might want to make the remote server the parent of all the other remote hosts: if it's down or inaccessible, there's no way you'll receive check data for the other hosts, and you'll probably get some unnecessary alerts. I'm sure I'll see some more issues, and will likely post again on this topic.