
Mike on Mirrors

Ed. Note: You may recall our recent interview with student developer Corbin Simpson. This week, we're bringing you an interview with Mike Cooper, one of the OSU Open Source Lab's student system administrators. Mike was kind enough to share his thoughts with us through an interview with Anthony Casson, the Lab's student writer. Stay tuned for more interviews with our students in the coming weeks.

What are you currently studying?

I am a junior studying computer science, though I am thinking about adding a minor in physics and/or math to my degree.

How long have you been at the OSL?

March 1st will make one year.

Would you say open source is an activity exclusively for coders, or is it available to anyone?

Open source is run by the community. Without developers, users, thinkers, doers, artists, word-of-mouth-spreaders, devil's advocates, competition, support, and most importantly people, open source cannot survive. If you think you can't contribute because you can't write code or can't host a server, you're wrong. Using software and talking about it is as important a contribution as submitting patches and running mirrors.

One of your projects is Cfengine to Puppet migration. Can you give us details?

Since the Puppet Workshop in August, we have been working at a low level to architect and design a replacement for our Cfengine system in Puppet. Over winter break and since then, this development has picked up the pace, and is now a major project for 3 of the 5 student system administrators.

For a bit of background, Cfengine and Puppet are configuration-management systems. What this means is that there is a central server that manages configuration. We have a group of Git repositories that control everything. If we want to install Apache on a server, we write a configuration in the relevant language that tells Cfengine/Puppet what packages are needed, what configuration files are needed, what services should be running, and anything else that it needs.

The beauty of the system is that we can then take this configuration and apply it to another similar system. After the initial setup, in theory, it is as easy as saying node { "": import apache } and Puppet takes care of the rest.
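The idea described above, writing down the desired state once and applying it to any similar machine, can be sketched in a few lines of Python. This is a toy model, not Puppet or Cfengine code: the `Role` and `Node` classes, the `plan` and `apply_role` functions, the hostnames, and the package, file, and service names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """Desired state for one kind of server (hypothetical structure)."""
    name: str
    packages: tuple = ()
    config_files: tuple = ()
    services: tuple = ()


@dataclass
class Node:
    """Current state of one machine, tracked in memory for illustration."""
    hostname: str
    installed: set = field(default_factory=set)
    files: set = field(default_factory=set)
    running: set = field(default_factory=set)


def plan(node, role):
    """List the actions needed to bring `node` in line with `role`."""
    actions = []
    actions += [("install", p) for p in role.packages if p not in node.installed]
    actions += [("write", f) for f in role.config_files if f not in node.files]
    actions += [("start", s) for s in role.services if s not in node.running]
    return actions


def apply_role(node, role):
    """Carry out the plan against the in-memory node and return what was done."""
    actions = plan(node, role)
    for verb, target in actions:
        if verb == "install":
            node.installed.add(target)
        elif verb == "write":
            node.files.add(target)
        else:  # "start"
            node.running.add(target)
    return actions


# Example names only; not taken from the OSL's real configuration.
APACHE = Role(
    name="apache",
    packages=("apache",),
    config_files=("/etc/apache2/httpd.conf",),
    services=("apache2",),
)

if __name__ == "__main__":
    for host in ("web1.example.org", "web2.example.org"):
        node = Node(host)
        print(host, apply_role(node, APACHE))  # three actions on the first run
        print(host, apply_role(node, APACHE))  # empty list: nothing left to do
```

Running the same role a second time produces no actions. That is the property that makes it safe to apply one description to many machines over and over.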

The thing I like about it is that Puppet is more flexible than Cfengine, most of the time. It allows conditional execution based on many "facts" about the system. It also has a more powerful description language. When needed, it can even be extended by writing Ruby libraries, which we have already taken advantage of to make it work better with Gentoo. Cfengine can't match this level of flexibility; to make it do more than it was designed for, we rely on external scripts and other hacks that are more brittle than we would like.
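Conditional behavior driven by facts can be pictured as a lookup keyed on what the machine reports about itself. The sketch below is plain Python rather than Puppet's own language. The `apache_package` function, the fact key `operatingsystem`, and the package-name table are illustrative assumptions, not the OSL's configuration.

```python
# Package names for the Apache web server differ between distributions.
# This table is an illustrative assumption for the example.
APACHE_PACKAGE_BY_OS = {
    "Gentoo": "www-servers/apache",
    "Debian": "apache2",
    "CentOS": "httpd",
}


def apache_package(facts):
    """Pick a package name from a dict of facts about the system."""
    os_name = facts.get("operatingsystem")
    try:
        return APACHE_PACKAGE_BY_OS[os_name]
    except KeyError:
        raise ValueError(f"no Apache package known for {os_name!r}") from None


if __name__ == "__main__":
    print(apache_package({"operatingsystem": "Gentoo"}))  # www-servers/apache
```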

The Bits Must Flow: Combined Bandwidth of Downloads, February 12 - 18, 2011, in Gigabits per Second

And what about your other major project, FTP mirroring?

The OSL hosts an FTP mirror site. This website is actually three servers: ftp-osl in our data center, ftp-chi in a TDS data center in Chicago, and ftp-nyc in another TDS data center in New York. Each of the three servers has about 6 TB of redundant storage in two arrays, each containing twenty-five 146 GB hard drives. For those keeping track, that’s 150 hard drives across all the servers.
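For readers who want to check the numbers, here is the arithmetic as a short Python snippet. The constants come straight from the paragraph above; the variable names are just for illustration.

```python
SERVERS = 3             # ftp-osl, ftp-chi, ftp-nyc
ARRAYS_PER_SERVER = 2
DRIVES_PER_ARRAY = 25
DRIVE_SIZE_GB = 146

total_drives = SERVERS * ARRAYS_PER_SERVER * DRIVES_PER_ARRAY
raw_tb_per_server = ARRAYS_PER_SERVER * DRIVES_PER_ARRAY * DRIVE_SIZE_GB / 1000

print(total_drives)       # 150
print(raw_tb_per_server)  # 7.3
```

The raw capacity of about 7.3 TB per server is higher than the roughly 6 TB quoted, which is consistent with some of the space going to redundancy. The interview does not say which redundancy scheme is used.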

There are two kinds of mirroring we provide. For projects that have an existing mirror infrastructure (even if it is just one "master" mirror), we do a scheduled pull of files, ranging from once a day to once an hour. ftp-osl uses rsync to pull the content from upstream. It then sets a trigger for the other mirrors, which pull down new changes once per minute. The second kind is for projects that don't have a good system set up for automatic mirroring. They can manually upload files to our mirror in Corvallis, OR, and trigger the updates to the other mirrors by hand.
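The pull-then-trigger flow can be sketched roughly as follows. This is not the OSL's actual tooling: the trigger-file mechanism, the serial number, the function names, and the upstream URL are assumptions made for the example. The rsync command is only built, never run; `-a` and `--delete` are standard rsync options.

```python
import tempfile
from pathlib import Path


def rsync_pull_command(upstream, dest):
    """Build (but do not run) an rsync command mirroring `upstream` into `dest`."""
    # -a preserves permissions and timestamps; --delete removes files that
    # have disappeared upstream so the mirror stays an exact copy.
    return ["rsync", "-a", "--delete", upstream, str(dest)]


def set_trigger(trigger, serial):
    """Primary side: record that content up to `serial` is ready to copy."""
    trigger.write_text(str(serial))


def needs_sync(trigger, last_seen):
    """Secondary side: True if the trigger is newer than what we last pulled."""
    if not trigger.exists():
        return False
    return int(trigger.read_text()) > last_seen


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        trigger = Path(tmp) / "TRIGGER"
        print(rsync_pull_command("rsync://upstream.example.org/project/", Path(tmp)))
        print(needs_sync(trigger, last_seen=0))  # False: nothing published yet
        set_trigger(trigger, serial=1)
        print(needs_sync(trigger, last_seen=0))  # True: secondaries should pull
        print(needs_sync(trigger, last_seen=1))  # False: already up to date
```

In this picture, the secondary mirrors would call `needs_sync` once per minute and only start a transfer when it returns true. A manual upload fits the same model: someone sets the trigger by hand.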

To put this all in context, consider a few things we mirror: Arch Linux, CentOS, Cygwin, Debian, Drupal, Eclipse, Fedora, Gentoo, Jenkins, TheDocumentFoundation (LibreOffice), MeeGo, Replicant, Ubuntu, and XBMC. When you download these projects, there is a chance that you are downloading them from one of our servers.

We are always adding new mirrors, and no system runs perfectly. We have systems that monitor the health of the mirrors, and occasionally things break and need fixing. Sometimes upstream mirrors go bad, and we have to find another upstream provider. Other times, a mirror will get disrupted by network issues, leaving partial files and stale locks around that need to be cleaned up manually.
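As an illustration of the cleanup step, a small script could list lock files that have not been touched for a while, so a person can review them before deleting anything. The `*.lock` naming pattern, the one-hour threshold, and the `find_stale_locks` function are assumptions made for this sketch, not a description of the OSL's setup.

```python
import os
import tempfile
import time
from pathlib import Path


def find_stale_locks(root, max_age_seconds, now=None):
    """Return lock files under `root` not modified within `max_age_seconds`."""
    now = time.time() if now is None else now
    stale = []
    for lock in Path(root).rglob("*.lock"):
        if now - lock.stat().st_mtime > max_age_seconds:
            stale.append(lock)
    return sorted(stale)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "debian.lock"
        fresh = Path(tmp) / "fedora.lock"
        old.touch()
        fresh.touch()
        two_hours_ago = time.time() - 2 * 3600
        os.utime(old, (two_hours_ago, two_hours_ago))  # pretend it is old
        # Only report candidates; removal is left to a human.
        print([p.name for p in find_stale_locks(tmp, max_age_seconds=3600)])
```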

Tell us about the Puppet training workshop you attended.

A few people from Puppet Labs came to Corvallis and held a three-day workshop about Puppet.

We started out learning basic syntax and the flow of the configuration language. We learned how to write modules of code and use Puppet to manage a single node. We tested all of this. Then we moved on and made a simple client/server setup. They provided a virtual machine image that we ran on our laptops. We then moved on to learn about the client/server architecture Puppet supports and some of the more advanced features. Basically, we got a three-day crash course in how to think the Puppet way.

This was really useful because even though Puppet uses language familiar to most programmers like "variables", "class", "function", or "inheritance", they work differently than most people are used to. Doing Puppet well requires a particular methodology to get all the bits in a row to make a large system, but once you can get it in your head, it is very powerful.

What’s your favorite part about working at the OSL?

I get to work with some great people, both in the building and online across the world. I get paid to support open source projects, and I get to learn while I do it. Compared to other jobs that college students tend to have, the OSL is great. For the most part I set my hours; I can work full-time when school is on break, if I want.

Finally, what has been challenging for you, and what advice would you share with newcomers?

To newcomers, or anyone for that matter, I will make two suggestions. First, don't be afraid to ask questions. Second, don't expect everyone else to solve your problems. In most communities, beginners are encouraged to learn, and people with experience will gladly help others learn. That being said, if you don't research what you are talking about and expect others to do everything for you, in lenient communities you won't ever build a reputation, and in some communities you will get laughed at and/or flamed out of the IRC channel, mailing list, forum, etc.

So if you want to get involved, do it! If you don't know where to start, ask someone. Good resources to learn from are wikis and documentation, and bug trackers can help you find things to do if you want to work on the project. Mailing lists and IRC channels are usually the best way to communicate with members of the community. Finally, keep in mind that many of the people in the channel have day jobs that aren't the project.

Many thanks to Anthony and Mike for this interview!
