Automatically Check RSYNC and Restart if Stopped

I occasionally use RSYNC to synchronize large directories of files between servers. This is especially useful if you're moving a client from one server to another and they have a lot of static files that are always changing. You can copy the files and sync them up, all with RSYNC, and if your connection gets cut off, it will pick up where it left off. It will also grab changes to files that have already been RSYNCd.

I ran into an issue with RSYNC recently, wherein the RSYNC process was running in the background but kept terminating with errors similar to the following. These disconnections were probably caused by the slow and unstable connection to the remote server.

rsync: writefd_unbuffered failed to write 998 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (888092 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]

Given that I was transferring files over a relatively bad internet connection and received this error a half dozen times over a couple of days, I decided the best way to handle it would be to write a cron script. This cron script should check for the RSYNC process and start it if it isn't running.

Customize this script for your own purpose, to check for your RSYNC process and start it if it isn't running.

#!/bin/bash

echo "checking for active rsync process"
COUNT=`ps ax | grep rsync | grep -v grep | grep -v "$0" | wc -l` # see how many are running, excluding this script
echo "there are $COUNT rsync related processes running"
if [ $COUNT -eq 0 ]
then
	echo "no rsync processes running, restarting process"
	killall rsync 2>/dev/null  # prevent RSYNCs from piling up, if by some unforeseen reason there are already processes running
	rsync -avz -e "ssh" /home/ccase/syncdirectory/ <user>@<remote server>:<destination directory>/
fi

Crontab Entry

Save the script in the appropriate cron directory, or add it to the cron.d directory, and put in a crontab entry to run it at the desired interval. This will have it run every 10 minutes.

*/10 * * * * ccase /etc/cron.d/<script name>

No More Worries

Now you can move on to other things, with the knowledge that your RSYNC will not just fail and leave the work undone. It probably wouldn't hurt to check on it at first and from time to time, but there's a lot less to worry about!

Mounting CIFS Shares At the LINUX Command Line or in /etc/fstab

Linux makes it relatively easy to mount shared drives either manually, at the command line, or automatically, by configuring an entry in /etc/fstab. Here is the basic syntax of our mount command.

[ccase@midas ~]$ sudo mount -t cifs  -o username=<share username>,password=<share password>,<additional options> //<name or ip of server>/<share name> <folder to mount to>

Here is an example of mounting our CIFS share to a folder named myshare. We are using the option ro to mount the share read only.

[ccase@midas ~]$ sudo mount -t cifs  -o username=admin,password=secret,ro //<name or ip of server>/<share name> myshare

If we want to make this automatic, it can easily be configured in /etc/fstab to mount after the network comes up. Here is the basic syntax you would use in /etc/fstab.

//<name or ip of server>/<share name> <folder to mount to> cifs  username=<share username>,password=<share password>,_netdev,<additional options>   0 0

Here is an example of mounting our CIFS share automatically to /mnt/myshare. We are using the option _netdev to tell it to attempt the mount only after the network has come up, and ro to mount the share read only.

//<name or ip of server>/<share name> /mnt/myshare cifs  username=admin,password=secret,_netdev,ro   0 0

Using the Linux Command Line to Find and Copy A Large Number of Files from a Large Archive, Preserving Metadata

One of my recent challenges is to go through an archive on a NAS and find all of the .xlsx files, then copy them, preserving as much of the file metadata (date created, folder tree, etc.) as possible, to a specified folder.  After this copy, they will be gone through with another script to rename the files using the metadata, and they will then be processed by an application which utilizes the name of the file in its process.

The part I want to share here is finding the files and copying them to a folder, with metadata preserved.  This is where the power of the find utility comes in handy.

Since this is a huge archive, I want to first produce a list of the files; that way I will be able to break this up into two steps. This will produce a list and write it into a text file.  I am first going to run a find command on the volume I have mounted, called data, in my Volumes folder.

find /Volumes/data/archive/2012 -name '*.xlsx' > ~/archive/2012_files.txt

Now that the list is saved into a text file, I want to copy the files in the list, preserving the file metadata and path information, to my archive folder.  The cpio utility accepts the paths of the files to copy from stdin, then copies them to my archive folder.

cat ~/archive/2012_files.txt | cpio -pvdm ~/archive

Explicitly Setting log4j Configuration File Location

I ran into an issue recently, where an existing log4j.xml configuration file was built into a jar file I was referencing, and I was unable to get Java to recognize another file that I wanted it to use instead.  Fortunately, the solution to this problem is fairly straightforward and simple.

I was running a standalone application in Linux, via a bash shell script, but this technique can be used in other ways too.  You simply add a parameter to the JVM call like the example below.

So the syntax is basically:

java -Dlog4j.configuration="file:<full path to file>" -cp <classpath settings> <package name where my main function is located>

Let's say I have a file named log4j.xml in /opt/tools/myapp/ which I want to use when my application runs, instead of any existing log4j.xml files.  This can be done by passing the JVM flag -Dlog4j.configuration to Java.

Here is an example:

java -Dlog4j.configuration="file:/opt/tools/myapp/log4j.xml" -cp $CLASSPATH  my.standalone.mainClass;

With that change, as long as your log4j file is set up properly, your problems should be behind you.

Toshiba P870: Installing Linux Mint

I have recently started using a Toshiba P870 laptop and decided to install Linux Mint 13 Maya (Cinnamon Edition) on it, due to its ease of use and overall security soundness.

Because the Toshiba P870 is a relatively new laptop, with some components' drivers not having been included in the installation files of Mint, it has been a little tricky. I'm sharing this for those who want to install Mint on the P870 or similar laptops.  This should save you a couple hours of searching.  It will get you the drivers you need and get you up and running.

We're going to discuss how to:

  • Download, burn and run the Linux Mint installer
  • Install the missing network drivers, both WIFI and Ethernet, so you can connect to the internet
  • Fix the internal sound problem that causes the internal speakers not to produce any sound
  • Update: Installing the SD card driver. Since this article was originally written, I also figured out that the SD card driver needs to be installed as well.

With these basics out of the way, you can then use your Toshiba P870 laptop for just about anything you want.

Tunneling Through a Remote Firewall Using SSH Commands

If you're dealing with systems behind a firewall, it's almost inevitable that you will need to tunnel into those systems from time to time.  Fortunately, there are some quick & easy commands to accomplish this.  In this example, we are going to use a Mac OS X or Linux-based system to gain access to a web server's port 80 on a firewalled server.

Let's say the domain of the remote server is <gateway domain>, the firewalled server has an internal IP address of <internal ip>, and the firewalled server has a web server at port 80.  We need to choose an unused port on our own system; in this case we'll use 2020.

So our side of the tunnel is going to be http://localhost:2020/ and the other side of the tunnel will be port 80 on <internal ip>.

ssh -L 2020:<internal ip>:80 <user>@<gateway domain>
<user>@<gateway domain>'s password:

So, now port 80 on the firewalled server will be accessible by simply pointing your web browser to http://localhost:2020/.  To terminate the tunnel, simply exit the shell.
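If you'd rather not keep an interactive shell open for the life of the tunnel, ssh can background itself. This is a sketch using the same placeholder values as above: -N asks for no remote command, and -f tells ssh to fork into the background after authentication.

```shell
# Forward local port 2020 to port 80 on the internal host, then detach
ssh -f -N -L 2020:<internal ip>:80 <user>@<gateway domain>
```

Since there is no shell to exit in this case, terminate the tunnel by killing the backgrounded ssh process (for example with kill, using the PID from ps).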

Tar/GZip Files in One Operation, Unattached to the Terminal Session

When you're trying to move a large block of files, it's often useful to do so in one command and to be able to close your terminal window (or allow it to time out). If you run a command under normal circumstances, losing the connection can cause your command to terminate prematurely; this is where nohup (No HangUP, a utility which allows a process to continue even after a connection is lost) comes in.

Let's say we have a large directory to back up, which we want to first tar, then gzip, keeping the command non-dependent on the continuity of the terminal session.
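As a minimal sketch of the technique (the directory and archive names here are hypothetical), tar's -z flag handles the gzip step in the same operation, and nohup plus a trailing & detaches the job from the terminal:

```shell
# Hypothetical stand-in for the large directory to back up
mkdir -p /tmp/nohup_demo/bigdir
echo "data" > /tmp/nohup_demo/bigdir/file.txt
cd /tmp/nohup_demo

# -c create, -z gzip, -f archive name; nohup keeps the job alive if the
# terminal hangs up, and all output goes to a log instead of the console
nohup tar -czf bigdir.tar.gz bigdir > tar_backup.log 2>&1 &

wait   # only needed here so we can inspect the result immediately
```

Once started this way, the terminal window can be closed (or time out) and the archive will still be completed.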

The Paradigm Shift to Accompany The Advent of Cheap Computing

Today, it came to my attention that a LINUX computer, priced between $25–35, is now available. This computer is called the Raspberry Pi.

It sure looks like computing is going to take on a whole new dimension in the coming years. No longer are there going to be significant financial barriers to acquisition, meaning they will be everywhere, and clusters of extremely cheap computers will add yet another dimension to cloud computing.

I think this will mean that computer technical skills are going to eventually be synonymous with literacy. Surely computers and computing will continue to evolve at a feverish pace, eliminating much of the unnecessary human toil.