Building the ACM Cluster, Part 11a: Setting up rpmbuild environment

Up to this point, we haven’t built any custom software for the cluster.  I’ve tried very hard to use mostly off-the-shelf software.  However, this has to change.  Several of the major components we’re going to use (Xen, the fiber card driver, Ceph) are not available in the CentOS repositories (or are too old).  So we’re going to build them ourselves.

However, rather than build them on a node-by-node basis (which would make a single install hours long…), we’re going to build packages.  CentOS uses the RPM package format to distribute prebuilt software, so we’ll be building RPMs.

Where do I build RPMs?

Many resources recommend not building RPMs as root.  This is quite sensible if you’re doing it on a machine that can’t easily be rebuilt - that way you can’t accidentally overwrite an important file.  However, since you …

Installing rpmbuild

The primary tool used to build RPMs is called, obviously enough, rpmbuild.  We also need a …
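As a rough sketch of where this setup ends up (assuming CentOS 6, with rpmdevtools possibly coming from the EPEL repository), the build tools install straight from yum:

# yum install rpm-build rpmdevtools

Then, as the unprivileged build user, rpmdev-setuptree lays out the standard ~/rpmbuild directory tree (BUILD, RPMS, SOURCES, SPECS and SRPMS):

$ rpmdev-setuptree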

Continue reading

Building the ACM VM Cluster, Part 10: Operating system image build

Now that we’re done with network configuration, let’s actually build an operating system to use on the nodes!

Let’s go ISO Huntin'!

The first step in building operating system install images is to get the full operating system images.  Not netboot, but a fully installable version.  For CentOS, the mirrors page is a good place to start your hunt.  Personally, I downloaded both DVD images (CentOS-6.3-x86_64-bin-DVD1.iso and CentOS-6.3-x86_64-bin-DVD2.iso), though I suspect the minimal image alone would cover it.
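For example (the mirror hostname below is a placeholder - substitute whichever mirror you picked from the mirrors page), grabbing the two DVD images is just a pair of downloads:

$ wget http://mirror.example.com/centos/6.3/isos/x86_64/CentOS-6.3-x86_64-bin-DVD1.iso
$ wget http://mirror.example.com/centos/6.3/isos/x86_64/CentOS-6.3-x86_64-bin-DVD2.iso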

Import Install Media

Next, we have to import the install media into xCat’s NFS filesystem.  To do this, we’ll use the copycds command.  copycds simply takes the ISOs you want to import as arguments:

# copycds CentOS-6.3-x86_64-bin-DVD1.iso CentOS-6.3-x86_64-bin-DVD2.iso

Sometimes copycds will tell you

Error: copycds could not identify the ISO supplied, you may wish to try -n <osver>

In which case your command will look more like

# copycds -n …
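For reference, a full invocation with an explicit distribution name generally looks something like the following (the osver and arch values here are assumptions for this particular CentOS 6.3 install):

# copycds -n centos6.3 -a x86_64 CentOS-6.3-x86_64-bin-DVD1.iso CentOS-6.3-x86_64-bin-DVD2.iso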
Continue reading

Building the ACM Cluster, Part 9: Setting up masquerade with iptables

Alright! Let’s get this started again.  There is one last thing we need to do to make networking on the cluster functional.  Right now, the nodes inside the cluster can’t speak to the outside world.  While we set up the head node to be able to speak to things on every interface, we haven’t yet told it how to move traffic from one interface to another.

Making the Gateway

In normal clusters, there are three types of nodes - workers, gateways and head nodes.  Workers do whatever task the cluster is intended for.  Head nodes manage the workers.  And finally, gateways allow the worker nodes to communicate with things outside the cluster.

Gateways are needed because clusters often use IP addresses which are not publicly routable.  The gateway allows the entire cluster to sit behind one IP address and is in charge of routing traffic properly.  This process is called Network Address Translation (NAT).  In many ways, this makes the gateway like your home router.
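As a rough sketch of what that ends up looking like on the head node (the interface name eth0 is an assumption for the outward-facing interface), NAT boils down to enabling forwarding and adding a masquerade rule:

# sysctl -w net.ipv4.ip_forward=1
# iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
# service iptables save

To make forwarding stick across reboots, you also set net.ipv4.ip_forward = 1 in /etc/sysctl.conf; the save command persists the iptables rule on CentOS 6.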

Anyway, …

Continue reading

End of the semester!

Hey world!  I’ve been inactive here for the last month and a half because my semester got intense and I, frankly, didn’t have time.  In that time, I’ve kept working on all my projects, job hunted, and got hired.  Phew.  Busy time.

In any case, this is a “hello! still alive!” and a warning that you’re probably going to see me post quite a bit here over the next few days as I catch up on my backlog of documenting things.

Continue reading

Endeavor Time Lapse

Hey everyone!  It’s been a bit of a lazy weekend - I’ve mostly been doing schoolwork.  I had some success with JHUACM cluster stuff that I haven’t written up yet (it’s fun!  I rewrote the spec file for building RPMs of a proprietary driver!).  However, I’ve come across a cool video to share.

Time Lapse: Endeavor’s Trip Through LA

In case you missed it, the space shuttle Endeavor recently moved off to its new home in the California Science Center.  The LA Times’ Brian Chan put together a time-lapse video of the move.  For your convenience, it’s embedded below:

What struck me most about this video is the meticulous amount of planning that must have gone into the move.  Everything from the devices the shuttle rests on to the precise measurement of the streets and trees to make sure there would be space.  Throughout this time lapse the shuttle’s movements look choreographed, perfectly planned for this route.  They even managed …

Continue reading

Building the ACM Cluster, Part 8: Adventures in Routing: Source Based (Multi-homed) Routing

(This post is related to the ACM cluster build.  However, it is really generic systems stuff and not terribly related to the actual cluster build.  It is much more closely related to quirks of JHU networking.)

The Problem

JHU has two distinct networks - firewalled and firewall-free.  (In truth there are more, and there are gradations, but these are the two the JHUACM has IP allocations on.)  Some services cannot be run from inside the firewalled network.  For these, the ACM has a small firewall-free allocation.  Because the cluster will be hosting VMs inside both networks, it needs to be capable of routing traffic from both.  This means doing something called source-based routing or multihomed routing, which refers to the fact that this machine will have two connections to the internet.  This is a fairly rare setup - multihoming is usually done at the ISP or datacenter level, rather than at the level of the individual box.

The Solution

The solution is to convert Linux to …
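As a minimal sketch of the iproute2 side of this (the table name, addresses and interface below are made up for illustration), each uplink gets its own routing table with its own default route, and a rule picks the table based on the packet’s source address:

# echo "100 jhuff" >> /etc/iproute2/rt_tables
# ip route add default via 192.0.2.1 dev eth1 table jhuff
# ip rule add from 192.0.2.10/32 table jhuff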

Continue reading

Building the ACM Cluster, Part 7: Network redux

So I’ve mentioned that I’ve been fighting networking again in the ACM cluster.  I’ve been reworking the network.  This whole adventure began after a conversation with the very knowledgeable nwf, who pointed out that JHU runs two different networks that the ACM systems need access to - CSNet (the JHU Computer Science department’s network) and JHU firewall-free (which has unfiltered access to the internet).  The goal of this rework was to allow the cluster to be on both.  In a situation with more resources, I would have simply bought another network card for each of the gateway nodes.  However, I don’t have those resources and couldn’t find any spare network cards.  nwf then pointed out that I would be able to use 802.1Q vlans to make more virtual ports.

So, here’s how this works:  CSNet and JHU firewall-free (JHUFF) each plug into a single port on the main switches.  These ports are assigned to specific vlans that differ from the other …
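On the Linux end of this (a sketch assuming CentOS 6; the parent interface and VLAN ID are made up), a tagged sub-interface is just another ifcfg file - the initscripts load the 8021q module and handle the tagging:

# cat /etc/sysconfig/network-scripts/ifcfg-eth0.42
DEVICE=eth0.42
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp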

Continue reading

Well, I disappeared...

So.  I’ve been quiet lately.  I’m still working on everything I was before.  But, the semester is in full swing, so I’ve had exams to study for.  This weekend, however, is going to be mostly ACM stuff.  A lot of that will be cleaning (I’ve made a mess with all the cluster gear lying about), but if I’m lucky, I’ll have a post early next week about the new network configuration for the ACM.  Especially if I can figure out how to get the switches to send well-formed DHCP requests over non-default vlans.  I might also have a writeup of getting the storage stuff built and of kickstarting nodes.  We’ll see.  I’m working on the cluster on many fronts.  And sometimes one distracts me from the others.

Continue reading