Fixing Performance Problems on Your JBoss Web Apps By Diagnosing Blocked Thread Issues

I was once perplexed by a bizarre performance issue that I encountered at seemingly random intervals in an application I help maintain. The application kept freezing up, without any log messages to use for diagnosis. This was very frustrating, because it meant the application server typically had to be restarted manually to restore service.

After a bit of research, I learned of thread blocking as a potential performance issue. Since I was fairly certain that the database was functioning within acceptable parameters, and that the server had ample CPU and memory to handle the load, I sought to determine whether thread blocking was the culprit.

I started by simply running a twiddle command to dump the threads whenever this performance problem was reported. This showed that BLOCKED threads were indeed the cause.
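For reference, here is a sketch of how such a thread dump can be requested on JBoss AS through twiddle's ServerInfo MBean. This assumes twiddle.sh is run from the server's bin directory and the server is listening on the default JNP port; adjust the host and port for your installation.

```
./twiddle.sh -s localhost:1099 invoke "jboss.system:type=ServerInfo" listThreadDump > threads.html
```

The resulting dump can then be searched for threads in the BLOCKED state and for the monitors they are waiting on.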

Efficiently Loading Paginated Results From MySQL with Just One Query Per Page

There are many situations in which pagination of query results is very useful, especially for performance optimization. In most of these situations, paginating the results requires you to determine the total number of results, so the application knows how many pages are available.

The most common way to do this is to use two queries: one that obtains the count of results, and one that obtains a particular page of results. However, this method has the potential to be even less optimal than loading the entire result set, because two queries are now necessary where before there was only one.

In most database systems it is possible to overcome this limitation, though the technique is specific to the particular database you are using. This example explains how to do it in MySQL.

Here is the sub-optimal example, using two queries; it loads the first page of results. The LIMIT 0, 40 means it will start at position 0 (the beginning) and return a set of at most 40 results.

SELECT count(*) FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01';
 
SELECT id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;

Here is a more optimal example, which uses two statements, only one of which is a real query. All of the work happens during the first statement; the second statement merely reads the count that was calculated during the first. Note that FOUND_ROWS() is scoped to the current connection, so both statements must be run on the same database connection.

SELECT SQL_CALC_FOUND_ROWS id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;
 
SELECT found_rows() AS cnt;

One of the drawbacks of SQL_CALC_FOUND_ROWS, or count(*) in general, is that by running these calculations you lose some of the benefit of pagination. This is because your database is required to examine all of the affected data in order to generate an accurate count.

Depending on the specifics of your my.cnf configuration file, the first statement will cache part of the information, causing it to execute faster when subsequent pages are loaded. In some of my own testing I have seen a significant speedup after the first page is loaded.
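Once the count has been read back, turning it into a number of pages, and a requested page number into a LIMIT offset, is simple arithmetic. Here is a minimal sketch; the helper names are my own and not part of any library.

```java
public class Paging {
	// Offset for a 1-based page number, used as the first argument to LIMIT.
	static int offsetFor(int page, int pageSize) {
		return (page - 1) * pageSize;
	}

	// Number of pages needed to display totalRows results, pageSize at a time.
	static int totalPages(long totalRows, int pageSize) {
		return (int) ((totalRows + pageSize - 1) / pageSize);
	}

	public static void main(String[] args) {
		// page 3 with 40 results per page corresponds to LIMIT 80, 40
		System.out.println(offsetFor(3, 40));
		// 95 total rows at 40 per page fit in 3 pages
		System.out.println(totalPages(95, 40));
	}
}
```

The rounding-up division in totalPages ensures a final partial page (for example, the last 15 of 95 rows) still gets its own page.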

If you want to get the most out of your application, you will likely need a combination of application tuning, query tuning, and database tuning. Generally you will want to start by tuning the application itself, eliminating any bottlenecks inherent in the application. Then you'll need to do some query optimization if the application still isn't fast enough. Finally, you'll want to look into what configuration changes you can make on the database in order to speed things up at the source.

Writing Complex Web Apps With Google Web Toolkit (GWT)

The Google Web Toolkit (GWT) is a relatively new set of open source tools, developed by Google, which aims to let developers write much of their client-side code in Java. This Java code is then compiled into the appropriate JavaScript to run in the user's web browser. Basically, the Google team has come up with a way for developers to write most of a web application in Java, instead of having to switch between Java and JavaScript, thus minimizing the amount of cross-browser JavaScript development and testing.

The developers of GWT have chosen to focus their efforts on Eclipse as the preferred IDE, though you are not limited to Eclipse. One of the great benefits of GWT is that you can now step through most of your application in the Eclipse debugger. This makes developing the client-side aspects of your app much easier and more stable than having to use JavaScript debugging tools like Firebug.

Attached is a Google Tech Talk from Google developer Bruce Johnson, in which he explains GWT in great detail. The video is a couple of years old, but it is still a good introduction to GWT.

Google Tech Talks, June 24, 2008

Eclipse Day at the Googleplex: GWT in Eclipse

Speaker: Bruce Johnson, Google

Building high-performance Ajax easily with Google Web Toolkit (GWT) in Eclipse has always been possible, but soon it will be downright easy. Bruce will present GWT's upcoming Eclipse plugin that helps novices get started and lets experts fly.

Streaming Data as Downloadable Files in Struts Servlet

One way to stream data to the client is to use PrintWriter, a class that lets you write directly to the output stream sent to the client. One of the benefits of streaming output to the client with PrintWriter is the ability to send data as it is generated, instead of having to buffer all of the data on the server and then send it to the client after the entire set is generated.

For convenience, and especially for large files, it is important to modify the HTTP headers in HttpServletResponse, instructing the client's browser to save the file to disk.

The following is a minimal example which shows how to dump a dummy CSV text file as a downloadable file in a Struts Action servlet.

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class CsvDataDumpAction extends Action {
 
	public ActionForward execute
	(ActionMapping mapping, ActionForm form, HttpServletRequest request, HttpServletResponse response)
	{
		// modify the HTTP response headers, so the file is downloaded as DataDump.txt
		response.setHeader("Content-disposition", "attachment; filename=DataDump.txt");
		response.setContentType("application/octet-stream");
 
		// response.getWriter() can throw an IOException, so wrap the streaming in a try block
		try {
			// obtain the PrintWriter that writes to the response's output stream
			PrintWriter pw = response.getWriter();

			// sample header with four fields
			String header = "Field1,Field2,Field3,Field4";
			pw.println(header);
			// flush the buffer, sending the header line to the client
			pw.flush();
 
			// generate 1000 lines of dummy test data, with the field name followed by the row number
			for (int i = 0; i < 1000; i++) {
				pw.println("Field1_"+i+",Field2_"+i+",Field3_"+i+",Field4_"+i);
				// flush the buffer after each line is generated,
				// sending the data to the client as it is generated
				pw.flush();
			}
		} catch (IOException e) {
			// log the stack trace if the stream cannot be written
			e.printStackTrace();
		}

		// returning null tells Struts the response has already been written
		return null;
	}
}