Rivald's Blog: 2015

Monday, December 28, 2015

Mellanox ConnectX-2 10GB Interface on FreeBSD 10.2

Although Mellanox's FreeBSD driver for the ConnectX-2 is included in the kernel source, the kernel modules are not included with the generic kernel. To use this card, build a custom kernel.

1. update the FreeBSD source using freebsd-update

sudo freebsd-update fetch
sudo freebsd-update install

2. change to the kernel config directory

cd /usr/src/sys/amd64/conf

3. copy the generic kernel config to a custom kernel config

sudo cp GENERIC MYKERNEL01

4. edit the config file (sudo vi MYKERNEL01). Add these lines to the bottom:

#### Mellanox ConnectX-2 support
options OFED
options IPOIB_CM
device ipoib
device mlx4ib
device mlxen

5. Compile the kernel and install it

cd /usr/src
sudo make buildkernel KERNCONF=MYKERNEL01
sudo make installkernel KERNCONF=MYKERNEL01

6. reboot

sudo shutdown -r now

Your new interface should show up when the machine comes back:

mlxen0: flags=8802 metric 0 mtu 1500
options=d07bb
ether 00:02:c9:52:ad:23
nd6 options=29
media: Ethernet autoselect (autoselect )
status: active

You can configure it however you'd like (IP, MTU, etc.).
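For example, to make an address and MTU persist across reboots, a line like the one below could go in /etc/rc.conf. The address 192.0.2.10/24 and the 9000-byte MTU are placeholder values chosen for illustration, and a larger MTU only helps if the switch and the far end are set to match.

```sh
# /etc/rc.conf fragment (rc.conf is plain sh variable assignments).
# The address and MTU below are assumed example values; substitute your own.
ifconfig_mlxen0="inet 192.0.2.10/24 mtu 9000"

# To apply the same settings immediately, without a reboot:
#   sudo ifconfig mlxen0 inet 192.0.2.10/24 mtu 9000 up
```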

If you run freebsd-update on a regular basis, you may want to append a kernel rebuild (step #5) to a script so you can rebuild the kernel automatically if there is a kernel source update.
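A minimal sketch of such a script follows. It assumes the kernel config name used above, and it decides whether to rebuild by comparing file timestamps under /usr/src/sys against a stamp file. The stamp file path, the function names, and the "run" argument are all inventions for this example, so treat it as a starting point rather than a tested tool.

```sh
#!/bin/sh
# Sketch: rebuild the custom kernel only when kernel sources changed.
# KERNCONF matches the config name used above; the stamp file path is an
# assumption made for this example.
KERNCONF="${KERNCONF:-MYKERNEL01}"
SRC_DIR="${SRC_DIR:-/usr/src/sys}"
STAMP="${STAMP:-/var/db/mykernel01.stamp}"

# Succeeds (exit 0) if the stamp is missing or any source file is newer than it.
kernel_src_changed() {
    src_dir=$1
    stamp=$2
    [ -e "$stamp" ] || return 0
    [ -n "$(find "$src_dir" -type f -newer "$stamp" | head -n 1)" ]
}

# Build and install the kernel, then record when we did it.
rebuild_kernel() {
    cd /usr/src &&
    make buildkernel KERNCONF="$KERNCONF" &&
    make installkernel KERNCONF="$KERNCONF" &&
    touch "$STAMP"
}

# Only act when called with "run", so sourcing the file has no side effects.
if [ "${1:-}" = "run" ]; then
    freebsd-update fetch install
    if kernel_src_changed "$SRC_DIR" "$STAMP"; then
        rebuild_kernel && echo "kernel rebuilt; reboot to use it"
    else
        echo "kernel sources unchanged; nothing to do"
    fi
fi
```

Run it as root with the single argument "run". The first run always rebuilds, because no stamp file exists yet.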

Thursday, November 19, 2015

2010 Macbook Pro and problems with Yosemite and/or El Capitan

I have had terrible issues with a mid-2010 MBP 15" Core i7 laptop. When doing a clean installation of either Yosemite or El Capitan, the installer fails toward the end or the machine crashes on first boot. I have not tried a fresh Mavericks install. However, a restore from a Mavericks Time Machine backup works perfectly fine. Alternatively, a Linux installation also worked well with no crashes.

Quite a few posts suggest it is a logic board problem and that Apple would fix it. However, being impatient, I tried my own workaround. I noticed that the problem didn't happen if I reduced the RAM from 8GB to 4GB (1 stick of DDR3 instead of 2).

A prevailing theory for this failure is that the Nvidia 330M card was crashing the system when it switched from the onboard to the 330M.

A couple of suggested workarounds:

1. disable automatic video switching in power preferences. This will likely consume battery at a faster rate.

2. use gfxCardStatus and force it to either onboard or discrete

What I did was:

1. did a fresh install of El Capitan with only 1 stick of DDR3 (4GB)

2. fully patched the OS

3. installed gfxCardStatus and forced it to onboard only.

I haven't had a crash since.

Wednesday, May 27, 2015

Windows netstat and findstr - Same as netstat | grep on Linux/Unix

Unix/Linux users are familiar with the following patterns of usage for finding listening ports:

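looking for services listening on ports 443 or 444 (use -n so ports are shown as numbers, and quote the pattern so the shell doesn't try to expand the brackets):
netstat -an | grep '44[34]'

find a service listening on port 602, 612, 604, 614, 608, or 618:
netstat -an | grep '6[01][248]'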

To do the same on windows, use netstat and findstr:

find a Windows service listening on port 602, 612, 604, 614, 608, or 618:
netstat -an | findstr /R 6[01][248]
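
Note that a bare pattern like 6[01][248] also matches longer numbers that merely contain those digits (port 6024, or an ephemeral port such as 51612). A tighter variant, sketched below against made-up sample lines in the Linux netstat layout, anchors the port between a colon and whitespace and keeps only LISTEN sockets. The function names are just for the example.

```sh
# Sample lines in the Linux "netstat -an" layout (made-up data for illustration).
sample_netstat() {
    printf '%s\n' \
        'tcp   0   0 0.0.0.0:602      0.0.0.0:*         LISTEN' \
        'tcp   0   0 0.0.0.0:6024     0.0.0.0:*         LISTEN' \
        'tcp   0   0 10.0.0.5:51612   10.0.0.9:443      ESTABLISHED' \
        'tcp   0   0 0.0.0.0:618      0.0.0.0:*         LISTEN'
}

# Keep only LISTEN sockets whose local port is exactly 602, 612, 604, 614, 608, or 618.
# ':' must precede the port and whitespace must follow it, so 6024 and 51612 do not match.
listening_on_6xx() {
    grep -E ':6[01][248][[:space:]].*LISTEN'
}

sample_netstat | listening_on_6xx
```

On a real system you would pipe netstat -an into the function instead of the sample data. On Windows, something like findstr /R /C:":6[01][248] .*LISTENING" should be roughly equivalent, though I have not verified it on every Windows version.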

Wednesday, April 29, 2015

Determining What Kind of SFP Is Installed in a Cisco Switch

There are several roundabout ways of determining what kind of SFP is installed in a socket on a Cisco switch, but I think the easiest is this:

testswitch#show int gi3/1/2 capabilities
GigabitEthernet5/1/1
Model: WS-C3750X-48
Type: 1000BaseLX SFP
Speed: 1000
Duplex: full
Trunk encap. type: 802.1Q,ISL
Trunk mode: on,off,desirable,nonegotiate
Channel: yes
Broadcast suppression: percentage(0-100)
Flowcontrol: rx-(off,on,desired),tx-(none)
Fast Start: yes
QoS scheduling: rx-(not configurable on per port basis),
tx-(4q3t) (3t: Two configurable values and one fixed.)
CoS rewrite: yes
ToS rewrite: yes
UDLD: yes
Inline power: no
SPAN: source/destination
PortSecure: yes
Dot1x: yes

You can see the type and everything. You'll need to be in enable mode, of course.

Tuesday, March 24, 2015

Acme/Oracle SBC Useful Commands

To get the box-id (useful for licensing, etc.)

show version boot

To get the current number of sessions

show sessions

To check the state of session agents

show sipd agents

To look up a number in the local routing table
(assuming you had created an LRT policy called inboundnums)

show lrt route-entry inboundnums 15555551212

To back up the config

backup name_of_backup

To verify the config

verify-config

Saturday, February 28, 2015

Using 3rd Party SFPs on Cisco 3560s and 3750s

It's well known that Cisco does not especially like to support third party hardware products within their switches. Perhaps one of the most common cases of this is the support for 3rd party SFPs on the Catalyst line.

In the days of GBICs, 3rd party interfaces usually worked. Enter the SFP. Here's a typical response from a 3750 when plugging in, say, a Finisar SFP:

Feb 27 00:01:47.952: %GBIC_SECURITY_CRYPT-4-VN_DATA_CRC_ERROR: GBIC in port Gi2/0/1 has bad crc
*Feb 27 00:01:47.952: %PM-4-ERR_DISABLE: gbic-invalid error detected on Gi2/0/1, putting Gi2/0/1 in err-disable state

and a show int status demonstrates this:

show int status | inc Gi
Gi2/0/1 err-disabled 1 auto auto unknown

As an unsupported workaround, you can tell the switch to ignore the fact that the SFP was not blessed by Cisco. Keep in mind that Cisco will likely tell you to put in Cisco-branded SFPs and that they won't support you with 3rd party SFPs in place (i.e., this is at your own risk!).

from conf t:
service unsupported-transceiver
no errdisable detect cause gbic-invalid

You'll see this message

Warning: When Cisco determines that a fault or defect can be traced to
the use of third-party transceivers installed by a customer or reseller,
then, at Cisco's discretion, Cisco may withhold support under warranty or
a Cisco support program. In the course of providing support for a Cisco
networking product Cisco may require that the end user install Cisco
transceivers if Cisco determines that removing third-party parts will
assist Cisco in diagnosing the cause of a support issue.

If you then move the SFP to another port, you should see something like this in show int status:

Gi2/0/2 connected 1 a-full a-1000 unsupported

It works... don't forget to do a write mem.

Tuesday, January 20, 2015

RedHat Enterprise 7/CentOS 7 Firewall One-liners

Assuming you're using the default zone of "public" (you may need to temporarily disable SELinux with setenforce 0):

1. To allow everyone to access port 8080/tcp:

firewall-cmd --zone=public --add-port=8080/tcp --permanent

2. Allow a server at the IPv4 address 10.20.30.40 to access this server on port 1234 over UDP:

firewall-cmd --zone=public --add-rich-rule='rule family="ipv4" source address="10.20.30.40/32" port port="1234" protocol="udp" accept' --permanent
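
Rules added with --permanent are written to the saved configuration only; they take effect in the running firewall after a reload. Here is a minimal sketch that wraps the rich rule above in a helper and then reloads. The function name rich_rule_allow and the "apply" argument are my own inventions for the example, not part of firewall-cmd.

```sh
# Build a rich rule that accepts traffic from one IPv4 host to one port.
# rich_rule_allow is a hypothetical helper name, not a firewall-cmd feature.
rich_rule_allow() {
    src=$1 port=$2 proto=$3
    printf 'rule family="ipv4" source address="%s/32" port port="%s" protocol="%s" accept' \
        "$src" "$port" "$proto"
}

# Only touch the firewall when called with "apply" (needs root and a running firewalld).
if [ "${1:-}" = "apply" ]; then
    firewall-cmd --zone=public --permanent \
        --add-rich-rule="$(rich_rule_allow 10.20.30.40 1234 udp)"
    firewall-cmd --reload                  # load the permanent rules into the running firewall
    firewall-cmd --zone=public --list-all  # confirm the port and rich rule are present
fi
```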

RedHat/CentOS error: connection activation failed: connection 'x' is not available on the device y

You'll see this error if you use nmcli to bring up a connection that references a NIC that is disconnected/unplugged - either unplugged from a bare-metal server or disconnected in VMware. Here are some examples of what it might look like:

connection activation failed: connection 'ethernet' is not available on the device ens32

connection activation failed: connection 'eth0' is not available on the device ens192

etc.

The fix is really as simple as connecting the cable or re-enabling the NIC in VMware and running the nmcli command again.


Wednesday, January 14, 2015

Simple Clustering on CentOS 7/RHEL 7 for an haproxy Load Balancer

Here are the steps to get a simple cluster going. We're not going to share storage, so quorum isn't going to work. We'll disable stonith and quorum.

pre-req: add hosts file entries for all nodes on all nodes, or at least make sure DNS is working correctly. Otherwise, you might receive errors like:

Error: unable to get crm_config, is pacemaker running?

yum install pcs fence-agents-all -y
firewall-cmd --permanent --add-service=high-availability
# set the password for the hacluster user - it should probably be the same on all nodes
passwd hacluster
# disable haproxy, as the cluster will start it

systemctl disable haproxy

# enable the services
systemctl enable pcsd
systemctl enable corosync
systemctl enable pacemaker
systemctl start pcsd.service

# check to see if it's alive
systemctl is-active pcsd.service
pcs cluster auth `hostname`
pcs cluster setup --start --name myclustername `hostname`
pcs cluster status

### we're not going to have stonith or a quorum
### (set these after the cluster is up; pcs needs pacemaker running to store properties)
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
# add an IP as a resource
pcs resource create vip1 IPaddr2 ip=172.29.23.80 cidr_netmask=22 --group haproxy

To add an additional node called "mynode2", authorize it on the master:

pcs cluster auth mynode2

(authenticate using "hacluster" user)

add it:

pcs cluster node add mynode2

You'll need to start the other node:

pcs cluster start mynode2

To see the status of the nodes:

pcs status nodes

example:
pcs status nodes
Pacemaker Nodes:
Online: mynode1
Standby:
Offline: mynode2
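
If you want to use that output in a script (for a quick health check, say), the online node names can be pulled out with awk. This is a small sketch based only on the sample output above; online_nodes is a name I made up, and other pcs versions may print extra sections whose "Online:" lines would be picked up too.

```sh
# Print the names of online nodes, one per line, from "pcs status nodes" output on stdin.
online_nodes() {
    awk '$1 == "Online:" { for (i = 2; i <= NF; i++) print $i }'
}

# Demo using the sample output above; on a real node you would run:
#   pcs status nodes | online_nodes
printf '%s\n' 'Pacemaker Nodes:' ' Online: mynode1' ' Standby:' ' Offline: mynode2' | online_nodes
```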

Now we add haproxy. Since the haproxy service wouldn't be too useful without the IP address, we'll set up a colocation rule as well.

pcs resource create HAproxy --group haproxy systemd:haproxy op monitor interval=10s
pcs constraint colocation add HAproxy vip1
pcs constraint order vip1 then HAproxy

# optional - we want the cluster to "favor" mynode1.
# if mynode1 is restarted, for example, mynode2 will get the resources,
# until mynode1 is back and running
pcs constraint location vip1 prefers mynode1

If you didn't do it earlier, turn off haproxy in systemd, as the cluster will start it:

systemctl disable haproxy

CentOS 7 error: Connection 'wired connection 1' is not available on the device ens32 at this time.

This error is presented when attempting to bring up a network connection using nmcli. The connection name can be anything, of course.

There is a RedHat bug on this titled "RHEL 7 syslog shows failure related to network.service" bug #1079353.

There is no fix listed in the bug. I've seen this on a virtual machine, and the easiest workaround was to delete the virtual machine's NIC, reboot, add a new NIC, and reboot again. After doing that, I deleted the new network connection using nmcli and applied the new device to the original connection profile I was attempting to use.

Tuesday, January 13, 2015

RHEL/CentOS nmcli Tips

You may need to disable SELinux temporarily to make changes to these files (setenforce 0).

Note that many nmcli commands will fail if the underlying device is not active (e.g., disconnected in VMware).

1. List connections

nmcli c show

2. rename a connection id called "outside" - change it to eth0

nmcli c modify outside connection.id eth0

3. change a nic (with a connection name of "ethernet" and a device name of ens32) to static and assign an address, gw, dns, etc. (172.19.22.1 is the default gateway. Separate additional addresses with commas, leaving a space before the default gateway.)

nmcli c modify ethernet connection.interface-name ens32 ipv4.method static ipv4.addresses "172.19.22.3/24 172.19.22.1" ipv4.dns 172.19.22.10,172.19.22.11 ipv4.dns-search mydomain.local
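
Newer NetworkManager releases (1.0 and later, as far as I know) have a separate ipv4.gateway property, and "manual" is the documented name for a static method. On those versions the same change would look roughly like the sketch below. The helper name static_ip_cmd is an assumption for the example; it only prints the command so you can review it before running it.

```sh
# Print (not run) an nmcli command that sets a static address with a separate gateway.
# static_ip_cmd is a hypothetical helper name; arguments: connection, device, address/prefix, gateway.
static_ip_cmd() {
    printf 'nmcli c modify %s connection.interface-name %s ipv4.method manual ipv4.addresses %s ipv4.gateway %s\n' \
        "$1" "$2" "$3" "$4"
}

static_ip_cmd ethernet ens32 172.19.22.3/24 172.19.22.1
```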

4. Bring up your new connection:

nmcli con up ethernet

5. delete a connection called "wired":

nmcli con delete wired

6. create a new connection (called "eth0") using an ethernet device called "ens32":

nmcli con add type ethernet con-name eth0 ifname ens32

7. change the hostname:

nmcli general hostname new_hostname

and restart hostnamed to pick up the change (your shell prompt won't change until you exec a new shell or reboot):

systemctl restart systemd-hostnamed