Restrict a Linux User’s Access: Only Allowing SCP/SFTP, no SSH

The standard techniques for restricting a Linux user account do not allow for file transfers to/from the user’s home directory. In my experience it is useful to have certain account types which are only allowed to upload/download files from their home directory, but not log in and run shell commands.

This is easy to do with a shell called rssh (Restricted Secure Shell); but you must first install it, because it does not typically come packaged with most distributions of Linux.

Installing RSSH

Locate the most appropriate package for your distribution of Linux at the download site. Once you have located the RPM, you will need to do the following steps, substituting your chosen package for the RPM.

[root@Internal ~]# wget http://packages.sw.be/rssh/rssh-2.3.2-1.1.el3.rf.x86_64.rpm
--2010-10-11 20:36:21--  http://packages.sw.be/rssh/rssh-2.3.2-1.1.el3.rf.x86_64.rpm
Resolving packages.sw.be... 85.13.226.40
Connecting to packages.sw.be|85.13.226.40|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://rpmforge.sw.be/redhat/el3/en/x86_64/rpmforge/RPMS/rssh-2.3.2-1.1.el3.rf.x86_64.rpm [following]
--2010-10-11 20:36:21--  http://rpmforge.sw.be/redhat/el3/en/x86_64/rpmforge/RPMS/rssh-2.3.2-1.1.el3.rf.x86_64.rpm
Resolving rpmforge.sw.be... 85.13.226.40
Reusing existing connection to packages.sw.be:80.
HTTP request sent, awaiting response... 200 OK
Length: 45053 (44K) [application/x-rpm]
Saving to: “rssh-2.3.2-1.1.el3.rf.x86_64.rpm”
100%[====================================================================================================================================================>] 45,053      94.6K/s   in 0.5s
 
2010-10-11 20:36:22 (94.6 KB/s) - “rssh-2.3.2-1.1.el3.rf.x86_64.rpm” saved [45053/45053]
 
[root@Internal ~]# rpm -ivh rssh-2.3.2-1.1.el3.rf.x86_64.rpm
warning: rssh-2.3.2-1.1.el3.rf.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 6b8d79e6
Preparing...                ########################################### [100%]
1:rssh                   ########################################### [100%]

Updating Access Permissions

Now you should be able to set a user’s login shell to rssh by editing /etc/passwd. Here is what the original line will usually look like.

joe:x:501:501::/home/joe:/bin/bash

This is what the updated line will look like.

joe:x:501:501::/home/joe:/usr/bin/rssh
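
If you prefer not to edit /etc/passwd by hand, the same change can be made with usermod. A quick sketch, assuming rssh was installed to /usr/bin/rssh as in the RPM above:

# set joe's login shell to rssh; equivalent to editing /etc/passwd directly
usermod -s /usr/bin/rssh joe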

What Happens if the User Attempts to SSH in After Access is Restricted

Now if joe attempts to log in via SSH, the following will occur:

[root@Internal ~]# ssh joe@localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is b5:39:02:23:01:a5:ff:b9:c1:aa:01:a9:69:21:a4:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
joe@localhost's password: 
 
This account is restricted by rssh.
This user is locked out.
 
If you believe this is in error, please contact your system administrator.
 
Connection to localhost closed.
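
The “locked out” message above appears because a fresh rssh install does not allow any protocols at all. To permit just file transfers, edit the rssh configuration file, which on most systems lives at /etc/rssh.conf, and uncomment the scp and sftp options. A minimal sketch:

# /etc/rssh.conf -- uncomment the protocols restricted users may use
allowscp
allowsftp

With those two lines enabled, joe can still transfer files to and from his home directory, but interactive SSH logins will continue to be refused.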

Efficiently Loading Paginated Results From MySQL with Just One Query Per Page

There are many situations in which pagination of query results is very useful, especially for performance optimization. In most of these kinds of situations, paginating the results requires you to determine the total number of results, so the application knows the number of pages available.

The most common way to do this is to use two queries: one which obtains the count of results, and one which obtains a particular page of results. However, this method has the potential to be even less optimal than loading the entire set of results, because two queries are now necessary, whereas before there was only one.

In most database systems it is possible to overcome this limitation, though the technique is specific to the particular database you are using. This example explains how to do this in MySQL.

Here is the sub-optimal example, using two queries; it is loading the first page of results. The LIMIT 0, 40 means it will start at position 0 (the beginning) and obtain a set of 40 results.

SELECT count(*) FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01';
 
SELECT id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;

Here is a more optimal example, which uses two statements, only one of which is a real query. Everything is done during the first statement; the second statement merely loads the count, which was calculated during the first statement.

SELECT SQL_CALC_FOUND_ROWS id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;
 
SELECT found_rows() AS cnt;
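
One caveat: found_rows() is connection-specific, so it must be run on the same connection as the SQL_CALC_FOUND_ROWS query, before any other statement. Here is a rough sketch of loading an arbitrary page from the shell, reusing the placeholder credentials from this example; both statements run in a single mysql session, so the count survives:

# sketch: fetch page 3 (40 results per page) and the total count in one round trip
page=3; page_size=40
offset=$(( (page - 1) * page_size ))
mysql -u <username> -p <databasename> -e "
SELECT SQL_CALC_FOUND_ROWS id, timestamp, field1, field2, field3
  FROM my_table
 WHERE timestamp >= '2010-03-15' AND timestamp <= '2010-08-01'
 ORDER BY id LIMIT $offset, $page_size;
SELECT found_rows() AS cnt;"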

One of the drawbacks of SQL_CALC_FOUND_ROWS, or count(*) in general, is the fact that, by running these calculations, you lose some of the benefit of pagination. This is because your database is required to examine all of the affected data in order to generate an accurate count.

Depending on the specifics of your my.cnf configuration file, the first statement will cache part of the information, causing it to execute faster when subsequent pages are loaded. In some of my own testing I have seen a significant speedup after the first page is loaded.

If you want to get the most out of your application, you will likely need to do a combination of application tuning, query tuning, and database tuning. Generally you will want to start by tuning the application itself; this way you’re eliminating any bottlenecks inherent in the application. Then you’ll need to do some query optimizations if the application still isn’t fast enough. Finally, you’ll want to look into what configuration changes you can make on the database in order to speed things up from the source.

Looking Towards Calmer Seas Ahead

For the first time in a long time, as far as software work is concerned at least, I’ve felt more and more like the Maytag Repairman. After many weeks up to my ears in customer issues, these hectic times may be drawing to a close, with more peaceful waters ahead.

I remember when one of the Computer Science professors at UAHuntsville once told me that software really is a machine, though most people don’t think of it as such. There really are moving parts, and though they are microscopic, their design makes a huge difference when you start to use something heavily in a business-critical environment. These are the lessons I have been learning first-hand over the past couple of years.

I started out at the job I am working now as a novice who knew a little about everything, working on a software project that, while generally well designed, had some scalability/stability issues in certain areas, as well as some user experience issues which needed addressing.

Now I’ve been through some major storms, in a manner of speaking, and I feel like I can really do something significant. I’ve seen pesky issues arise as our customers grow, sometimes spending weeks tracking them down and fixing them.

In the process of moving through these issues, I’ve learned a great deal about tuning up MySQL and HQL (Hibernate Query Language), as well as tuning the application itself. I’ve learned that the most important place to start optimizing is the application itself; because otherwise you are just wasting your available resources.

I’ve gotten into this routine where I hunger for more wisdom and hands-on experience. Often, when I get a free evening, I’ll go to the software section of the bookstore, drink coffee and absorb as much information as I can. I really love keeping up with the latest innovations. When the opportunity arises, I like to employ what I learn in the projects I’m working on.

Writing Complex Web Apps With Google Web Toolkit (GWT)

The Google Web Toolkit (GWT) is a relatively new set of open source tools, developed by Google, which aims to allow developers to write much of their client-side code in Java. This Java code is then compiled into the appropriate JavaScript code, to run in the user’s web browser. Basically, the Google team has come up with a way of allowing developers to write most of their web applications in Java, instead of having to switch between Java and JavaScript, thus minimizing the amount of cross-browser JavaScript development/testing.

The developers of GWT have chosen to focus their efforts on Eclipse as the preferred IDE, though you are not limited to Eclipse. One of the great benefits of GWT is that you can now step through most of your application in the Eclipse debugger. This makes developing the client-side aspects of your app much easier and more stable than having to use JavaScript debugging tools like Firebug.

Attached is a Google Tech Talk from Google developer Bruce Johnson, in which he explains GWT in great detail. The video is a couple of years old, but it is still a good intro to GWT.

Video: “GWT in Eclipse”, presented by Bruce Johnson of Google at Eclipse Day at the Googleplex; Google Tech Talk, June 24, 2008.

Building high-performance Ajax easily with Google Web Toolkit (GWT) in Eclipse has always been possible, but soon it will be downright easy. Bruce will present GWT’s upcoming Eclipse plugin that helps novices get started and lets experts fly.

Sending Mail in Shell Scripts via an External Server with Nail

If you’ve ever tried sending email via the command line, using the mail utility, you may find that the method can be unreliable in some cases. The email messages are often intercepted by spam bots, filtered by security programs, etc. A more elegant and stable alternative is to use your existing email server to send the message. Using the program nail makes this an easy task to do via the command line.

The following example shows you how to send a simple message with an attachment. Here is the syntax for sending a message with nail.

echo "<message body>" | nail -s "<subject>" -a <attachment> <recipient1> <recipient2> ...

In order for nail to function, you must have the .mailrc configuration file in your home directory. Here is a sample .mailrc configuration file to get you started quickly.

set smtp=smtp://yourhost.com
set from="yourname@yourhost.com (Display Name)"
set smtp-auth=login
set smtp-auth-user=your_username
set smtp-auth-password=your_password
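
Once the configuration file is in place, sending a message from a script is a one-liner. A small sketch; the log file path and recipient address here are hypothetical:

# mail a report with a log file attached, sent through the SMTP server configured above
echo "Nightly report attached." | nail -s "Nightly report for $(hostname)" -a /var/log/app.log admin@yourhost.com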

Streaming Data as Downloadable Files in Struts Servlet

One way to stream data to the client is to use the PrintWriter class, which allows you to write directly to the output stream that is sent to the client. One of the benefits of streaming output to the client with PrintWriter is the ability to send data as it is generated, instead of having to buffer all of the data on the server and then send the data to the client after the entire set is generated.

For convenience, and especially for large files, it is important to modify the HTTP headers in the HttpServletResponse, instructing the client’s browser to save the file to disk.

The following is a minimal example, which shows how to dump a dummy CSV text file as a downloadable file in a Struts Action servlet.

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class CsvDataDumpAction extends Action {
 
	public ActionForward execute
	(ActionMapping mapping, ActionForm form, HttpServletRequest request, HttpServletResponse response)
	{
		// modify the HTTP response headers, so the browser saves the stream as DataDump.txt
		response.setHeader("Content-disposition", "attachment; filename=DataDump.txt");
		response.setContentType("application/octet-stream");
 
		// getWriter() can throw an IOException, so obtain the PrintWriter inside the try block
		try {
			PrintWriter pw = response.getWriter();

			// sample header with four fields
			String header = "Field1,Field2,Field3,Field4";
			pw.println(header);
			// flush the buffer, sending the header line to the client
			pw.flush();
 
			// generate 1000 lines of dummy test data, with the field name, followed by the number of the row
			for(int i = 0; i < 1000; i++) {
				pw.println("Field1_"+i+",Field2_"+i+",Field3_"+i+",Field4_"+i);
				// flush the buffer after each line is generated,
				// sending the data to the client as it is generated
				pw.flush();
			}
 
		// log a stack trace if writing to the client fails
		} catch (IOException e) {
			e.printStackTrace();
		}

		// returning null tells Struts that the response has already been handled
		return null;
	}
}

Bash Shell Script to Convert All MySQL Tables to use the Innodb Engine

The following, when run in the Bash shell, will convert all of your MySQL tables to the InnoDB engine.

mysql -B -N -e "SHOW TABLES" -u <username> --password=<password> <databasename> | while read table; \
do \
     echo "+ Converting Table $table"; \
     mysql -B -N -e "alter table $table engine=innodb" -u <username> --password=<password> <databasename>; \
done;
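
To confirm the conversion worked, you can ask MySQL which engine each table is now using. A quick sketch, reusing the same placeholders; information_schema is available in MySQL 5.0 and later:

# list each table's storage engine after the conversion
mysql -B -e "SELECT table_name, engine FROM information_schema.tables WHERE table_schema = '<databasename>'" -u <username> --password=<password>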

Often, if you have a heavily used database, you will want to consider the InnoDB engine. By default, MySQL uses the MyISAM engine. However, the InnoDB engine has many advantages, particularly for high-utilization environments. With InnoDB, data integrity is maintained throughout the entire query process, because it is transaction safe.

InnoDB also provides row-locking, as opposed to table-locking. With row-locking, while one query is busy updating or inserting a row, another query can update a different row at the same time. Hence InnoDB has superior features for multi-user concurrency and performance.

Copying Yesterday’s Exceptions with Stack Traces from Logs, Then Emailing To Administrators

When you have a Java application server which generates a great deal of logs, it can be tricky to find the most important information, especially if you have detailed logging. Fortunately, grep is capable of doing this very well.

The following command will gather all WARN, ERROR, and FATAL messages, along with Exception stack traces. This command can be very useful for Java log monitoring scripts.

cat /jboss-4.0.2/server/default/log/server.log | grep "WARN\|ERROR\|FATAL\|Exception\|at.*\.java\:.*"

Understanding this expression

In this expression, ‘\|’ is used as an OR operator to combine several patterns. The ‘WARN’, ‘ERROR’, and ‘FATAL’ patterns match the first line of any log event written at the WARN, ERROR, or FATAL logging levels, which may contain an exception. The ‘Exception’ pattern then matches the first line of a stack trace, which usually consists of the exception name followed by the exception message, e.g. ‘java.lang.NullPointerException’.

After that come the stack trace elements, which start with ‘at’ and end with the pattern ‘(FileName.java:lineNo)’, e.g. at java.lang.Thread.run(Thread.java:595). These stack trace elements are matched by ‘at.*\.java\:.*’. All of these patterns OR’ed together can capture the complete stack trace of anything logged at the WARN, ERROR, or FATAL levels. Some false positives may also be matched by this command, if an ordinary log message happens to contain words like WARN, ERROR, FATAL, or Exception.

source: Computer Technology Roller

Filtering by Date: Yesterday’s Logs

If you want to filter the logfiles after a certain date, the following command is very useful. It gets the date for yesterday, using the date format yyyy-mm-dd, then it uses sed to delete everything up to the first line containing that date, printing all of the lines after it. This is a good command to run after midnight, to retrieve the previous day’s logs.

cat /jboss-4.0.2/server/default/log/server.log | sed "1,/$(date --date='yesterday' '+%Y-%m-%d')/d"

Putting it All Together

Daily Log Monitor Script to Email Error Stack Traces to the Administrator

Here is a complete monitoring script I wrote, which emails me all of the previous day’s error stack traces. I have it running in cron.daily in order to regularly send me the JBoss error stack traces.

#!/bin/bash

# email addresses to send the message to
email="address@host.com"

# extract yesterday's error stack traces from the log
errors=$(cat /jboss-4.0.2/server/default/log/server.log | sed "1,/$(date --date='yesterday' '+%Y-%m-%d')/d" | grep "ERROR\|FATAL\|Exception\|at.*\.java\:.*")

subject="JBOSS DAILY ERROR DIGEST FOR: $(hostname)"

echo "$errors" | /bin/mail -s "$subject" "$email"

Monitoring Process Counts and Alerting Via Email

Below is a simple script called monitor_jboss, which checks to see if JBoss is running and whether or not too many instances are currently running. I found a need to write this script because we have some cron scripts which automatically restart JBoss each day, and the JBoss shutdown script itself sometimes fails to properly shut down, causing some quirky behavior.

If it determines that one of the following conditions is true, it sends a short email to the address specified in the variable email, describing the problem.

  • JBoss is not running at all
  • JBoss has more than max instances running

This script is then placed in /etc/cron.hourly/, where it will check the system once an hour and send an email as appropriate.

#!/bin/bash

# email addresses to send the message to
email="address@host.com"

# maximum number of concurrently running instances allowed
max=1

# determine the number of running instances
count_running_jbosses=$(ps aux | grep jboss | grep -v grep | grep -v monitor_jboss  | wc -l)

if [ $count_running_jbosses -eq "0" ]            # jboss isn't running
then
        message="JBoss Is Currently Not Running"
fi
if [ $count_running_jbosses -gt $max ]          # too many jboss instances running
then
        message="JBoss Is Currently Running $count_running_jbosses instances; the maximum is $max"
fi

subject="JBOSS MONITORING ALERT FOR: $(hostname)"

echo "$message" | /bin/mail -s "$subject" "$email"

Locating Duplicate Entries In A MySQL Table

If you don’t have unique indexes on your table, it is likely that you will occasionally have entries that are duplicated. This can often happen because of a software bug, or possibly a user error. Some applications even choose not to have unique indexes for performance reasons, though this happens at the cost of data integrity.

The best way I know of to demonstrate how to locate the duplicate entries is to use an example. So let’s say you have the following table:

mysql> desc result;
+-------------------------+--------------+------+-----+---------+----------------+
| Field                   | Type         | Null | Key | Default | Extra          |
+-------------------------+--------------+------+-----+---------+----------------+
| id                      | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| comment                 | varchar(255) | YES  |     | NULL    |                |
| key1_id                 | bigint(20)   | YES  | MUL | NULL    |                |
| key2_id                 | bigint(20)   | YES  | MUL | NULL    |                |
| str1                    | varchar(255) | YES  | MUL | NULL    |                |
| result                  | float        | YES  |     | NULL    |                |
+-------------------------+--------------+------+-----+---------+----------------+

In this example, let’s say you want the following set of fields to have a unique combination of values. In other words, you want these fields to be a sort of composite key, so that there are no two records in the database which have the same values for these three fields.

| key1_id             | bigint(20)   | YES  | MUL | NULL    |                |
| key2_id             | bigint(20)   | YES  | MUL | NULL    |                |
| str1                | varchar(255) | YES  | MUL | NULL    |                |

Well, let’s say that you are concerned about having duplicate values in your data. You need to have an effective way of querying the database in order to find the duplicate records. The following query would locate duplicates of the above field-set.

mysql> select id, str1, key1_id, key2_id, count(*)
from result group by str1,key1_id,key2_id
having count(*) > 1;

In the example, you group by the list of fields that you want to use as your composite key, and select the fields you want to see in your output. The above output looked like the following:

+-------+------+---------+---------+----------+
| id    | str1 | key1_id | key2_id | count(*) |
+-------+------+---------+---------+----------+
| 17293 | EDDP |    1951 |    1923 |        4 |
| 17302 | GRAV |    1951 |    1923 |        4 |
| 17294 | MTDN |    1951 |    1923 |        4 |
| 17301 | OXID |    1951 |    1923 |        4 |
| 17303 | PH   |    1951 |    1923 |        4 |
| 17288 | XTC  |    1951 |    1923 |        4 |
+-------+------+---------+---------+----------+
6 rows in set (0.52 sec)

What we have in this example is four copies of each of these records, which were supposed to be unique.
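
If you decide to clean these up, one common approach is to keep the row with the lowest id in each duplicate group and delete the rest. The following is only a sketch, so back up the table first; note also that NULL values never compare equal, so groups containing NULLs in the key fields are left untouched:

# keep the lowest id in each duplicate group and delete the rest
mysql -u <username> -p <databasename> -e "
DELETE r1 FROM result r1
  JOIN result r2
    ON r1.str1    = r2.str1
   AND r1.key1_id = r2.key1_id
   AND r1.key2_id = r2.key2_id
   AND r1.id      > r2.id;"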