Monday, December 14, 2009

Issue while upgrading the GrEclipse plug-in

I upgraded my GrEclipse plug-in today to a newer version. The upgrade URL is given in this page. Since it was an upgrade, I clicked on Help -> Check for Updates in Eclipse. Immediately I got the following error:

Cannot complete the install because of a conflicting dependency.
  Software being installed: Groovy-Eclipse Feature 2.0.0.xx-20091211-1000-e35 ( 2.0.0.xx-20091211-1000-e35)
  Only one of the following can be installed at once:
    JDT Core patch for Groovy-Eclipse plugin 2.0.0.xx-20091017-2200-e35 (org.codehaus.groovy.jdt.patch.feature.jar 2.0.0.xx-20091017-2200-e35)
    JDT Core patch for Groovy-Eclipse plugin 2.0.0.xx-20091211-1000-e35 (org.codehaus.groovy.jdt.patch.feature.jar 2.0.0.xx-20091211-1000-e35)
... and a lot more ...
To recover from the error, I uninstalled the older, conflicting JDT Core patch (the 20091017 build in the message above). To do this, click on Help -> Install New Software and then on the "What is already installed?" link at the bottom of the dialog. There you can easily locate the old JDT Core patch and uninstall it.

Thursday, December 03, 2009

Hurray! I have installed 2 GB RAM in my laptop

Last night I upgraded the RAM installed in my laptop from 1 GB to 2 GB. I found the following YouTube video to be very helpful and informative.

Tuesday, December 01, 2009

Two performance tweaks to boost your Vista performance

I have an HP dv6000 laptop with Vista Home Premium. Since day 1, my machine has been terribly slow for no apparent reason. I tried many tweaks that didn't seem to work at all. I installed Process Explorer from the Sysinternals tools to monitor which processes were taking the most resources. Two things gave me a good performance boost:
  1. Disable Windows Defender.
  2. Disable the Windows indexing service.
I have my own anti-virus software (Avira), which provides much better protection than Windows Defender. But because both Defender and Avira were scanning each file and process, the load needed to protect the system was effectively doubled. So I disabled Defender. You can read how to turn off Defender here.

I don't need the indexing service offered by Windows. The indexer hogs a lot of resources and I rarely use the search feature. You can easily disable indexing by right-clicking on the drive in Windows Explorer and un-checking the indexing option for that drive. Make sure the change is applied to the drive and to all the files and directories under it. You can read more about this here.

Another worthy exercise is to launch msconfig and inspect all the programs that are started during start up. You can read more about how to do this here.

If you still experience slowdowns, launch Process Explorer and inspect which process is taking the most CPU. You can press the space bar to pause the screen refresh and inspect all the processes that are running.

Everything you want to know about memory

Recently I have been looking to upgrade the memory in my laptop. I was trying to understand what the specs mean and how to find the right RAM model for my laptop.

I found this article to be extremely useful. You can also read a follow up article that talks about the details of the RAMs used in GPUs.

Also this article was helpful in understanding the FSB speeds and bus speeds in general.

Tuesday, November 17, 2009

Reloading inittab without reboot

Use "/sbin/init q" if you made any changes to the inittab file and would like init to reload that file.

Thursday, November 12, 2009

A note on Java's Calendar set() method

Remember that the Calendar's internal fields include year, month, date, hour, minutes, seconds, milliseconds and time zone. Whenever you are calling a set() method with multiple fields, like set(year, month, date), it will not affect the rest of the fields.

Remember that no multi-field set() method sets the milliseconds. If you would like to set the milliseconds, you must use set(Calendar.MILLISECOND, value). Likewise, if you are planning to set all the fields, it's a good idea to reset them first using the clear() method. This clears the milliseconds as well.

Most of the time, the millisecond field may not be of interest to you. But if you are going to use the UTC milliseconds returned by getTimeInMillis(), make sure the millisecond field holds the right value as well.

Sunday, November 01, 2009

Netcat for Windows

I was troubleshooting an issue with one of our clients, and I desperately needed them to run the netcat tool and send me the output. But their OS is Windows, so I went looking for a netcat build for Windows. Here it is:

Thursday, September 03, 2009

Eclipse - issue with setting break points

I recently ran into a weird issue. I had to debug a piece of code that I wrote, so I launched the application in Eclipse in debug mode and set a few break points. The log messages around all the break points appeared in the log, so I knew the code was being executed, yet the debugger stopped at some break points and not at others. I checked the output directory, I checked the compiler flags, and I even downgraded to Ganymede. Nothing seemed to work.

When I asked the question in the, I got the answer in five minutes. Looks like JDK 1.6 update 14 has an issue with debugging. So upgrading to JDK 1.6 update 16 helped. But still I am seeing the issue occasionally.

Saturday, August 29, 2009

CircuitCity online is open for business again

CircuitCity online is open for business again. They filed for bankruptcy in Nov 2008 and closed both the online store and the retail stores.

Comparison of SSD and platter-based HD (HDD)

I am in the process of assembling a PC for my friend. So I was gathering information on various components. I thought I must share what I learned about hard disks.

The latest breed of hard disks in the PC segment are called solid-state drives (SSD). Though the technology is not new, the price has become affordable only in the last few years. Because SSDs used to be so expensive, computers were usually shipped with conventional hard disks that store data on spinning platters (I will refer to them simply as HDD). You can read about SSDs on Wikipedia.

These are the points I wanted to share:
1) The transfer speed of an SSD is faster than that of an HDD. A good 7200 RPM hard disk usually has a transfer rate of around 70 MBytes/sec, whereas an SSD has a transfer speed of around 200 MBytes/sec. Remember that these are approximate figures and that speeds vary for read and write. One of the recent additions to hard disks is the 10000 RPM drive. I found them to be faster than SSDs. For example, Western Digital's Velociraptor has a peak transfer speed of 384 MBytes/sec.

2) The lifetime of an SSD is very good. The Mean Time Between Failures for an SSD is in millions of hours. For example, OCZ's 60 GB SSD has an MTBF of 1.5 million hours. So SSDs are more robust.

3) The shock tolerance of an SSD is substantially higher. For example, the maximum shock resistance of OCZ's 60 GB SSD is 1500G. For the Velociraptor, the same value is 300G.

4) SSDs make less noise than HDDs as there are no mechanical parts.

5) One of the cons of SSDs is that the price per bit is still substantially higher. For example, OCZ's 60 GB SSD costs $219, whereas the 300 GB Velociraptor costs $229.
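
To put the price gap in numbers, here is a small Python sketch that works out the cost per gigabyte from the two prices quoted above. The function name price_per_gb is my own and is only there for illustration.

```python
def price_per_gb(price_usd, capacity_gb):
    """Return the cost of one gigabyte in US dollars."""
    return price_usd / capacity_gb

# Prices and capacities as quoted in point 5 above.
ssd = price_per_gb(219, 60)    # OCZ 60 GB SSD
hdd = price_per_gb(229, 300)   # Velociraptor 300 GB

print(f"SSD: ${ssd:.2f}/GB, HDD: ${hdd:.2f}/GB, ratio: {ssd / hdd:.1f}x")
```

With these figures the SSD works out to roughly 4.8 times the price per gigabyte of the Velociraptor.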

What's my advice? If you are serious about the life of the hard disk, try an SSD. If you are crazy about speed, try the Velociraptor.

What have I decided? I have decided to buy the SSD and to add an external hard disk with eSATA interface. I have decided to buy Iomega Prestige 1TB.

Friday, July 31, 2009

Ubuntu from USB drive

I wanted to try running Ubuntu from my USB drive (2GB). While I was searching for the right software, I came across the UNetbootin utility. This utility is a lot simpler to use than the method provided on the Ubuntu site, which makes use of Win32 disk imager. The best part is that UNetbootin lets you create bootable drives for other operating systems too, like FreeBSD, NetBSD and other notable flavors of Linux.

UNetbootin does not need to be installed; it's just one .exe file. You download the exe file and start it. It will prompt you for the version of the OS you would like to write to your USB drive. Once you select one, it will automatically download the ISO image file and write it for you. In case you happen to have the ISO image file available on your local hard drive, you can provide that path instead.

Eclipse icons

Here is a complete list of icons that are used in Eclipse. It is very useful to understand all these icons as most of the time in the outline views only the icons are used to crisply denote what each member/method stands for.

Wednesday, July 29, 2009

Good JavaScript frameworks

This page provides you a list of top 5 JavaScript frameworks. Though I am not sure if they are the top 5, which is always arguable, I am very sure that those frameworks are good and useful.

Saturday, July 11, 2009

VirtualBox 3.0 freezes bug has been fixed

If you have been following my blog, you may remember that I posted earlier about VirtualBox 3.0 freezing under network load. That bug has been fixed in release 3.0.2. Please refer to the release notes of 3.0.2 for an update. I verified the fix by installing the 3.0.2 version.

Troubleshooting network issues in VirtualBox

VirtualBox comes with a built-in facility to trace network packets. To enable tracing, you must first shut down your VM. Then find out the VM's name or uuid. Please refer to my earlier post to see how to find the uuid. Once you have the uuid of the VM, give the following command:

vboxmanage modifyvm a27473c0-d690-4c95-a48a-1c49d69a20e6 --nictrace1 on --nictracefile1 c:\temp\nictrace1.pcap

Replace my uuid with yours. I am enabling tracing on the NIC1. You will have to change the parameters appropriately if you would like to trace some other NIC. To make sure that your VM's tracing is enabled on the specified interface, give the following command:

vboxmanage showvminfo a27473c0-d690-4c95-a48a-1c49d69a20e6
VirtualBox Command Line Management Interface Version 2.2.4
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

Name: Ubuntu8.10
Guest OS: Ubuntu
UUID: a27473c0-d690-4c95-a48a-1c49d69a20e6
Config file: C:\Documents and Settings\xxxx\.VirtualBox\Machines\Ubuntu8.10\Ubuntu8.10.xml
Memory size: 512MB
VRAM size: 12MB
Boot menu mode: message and menu
ACPI: on
PAE: off
Time offset: 0 ms
Hardw. virt.ext: on
Nested Paging: off
VT-x VPID: off
State: powered off (since 2009-07-11T18:45:41.000000000)
Monitor count: 1
3D Acceleration: off
Floppy: empty
SATA: disabled
IDE Controller: PIIX4
Primary master: D:\xxxx\VM\Ubuntu8.10-desktop-xVM\VDI\ubuntu-8.10-x86.vdi (UUID: 58a58064-e589-47c0-baba-de38fa62
DVD: C:\PROGRA~1\Sun\XVMVIR~1\VBoxGuestAdditions.iso (UUID: 60b22ad4-50d1-4fb7-8097-862ce763df52)
NIC 1: MAC: 080027AC3BCE, Attachment: NAT, Cable connected: on, Trace: on (file: c:\temp\nictrace1.pcap), Type: Am79C973, Reported speed: 0 Mbps
NIC 2: disabled
NIC 3: disabled
NIC 4: disabled
NIC 5: disabled
NIC 6: disabled
NIC 7: disabled
NIC 8: disabled
UART 1: disabled
UART 2: disabled
Audio: enabled (Driver: DSOUND, Controller: AC97)
Clipboard Mode: Bidirectional
VRDP: disabled
USB: enabled

Once done, start your VM. The trace will be written to the file you have specified. You can use tcpdump or Wireshark to analyze the packets.

Make sure you disable the trace once you are done. Give the following command to disable tracing:
vboxmanage modifyvm a27473c0-d690-4c95-a48a-1c49d69a20e6 --nictrace1 off

The only limitation of the trace facility is that you cannot turn tracing on or off while the VM is up and running. That's a serious limitation if you run into a network issue in the middle of a session in your VM.

Thursday, July 09, 2009

Line color in PlotKit

I am experimenting with the PlotKit tool. So far I like the simplicity of the library and the good documentation.

I was making use of the SweetCanvasRenderer to draw a line graph, as I was strictly following the quick start example. I wanted to draw two line graphs in the same canvas, each with different colors. But with the SweetCanvasRenderer it is not possible to change the line color, as it always overrides the strokeStyle property in the Context with white color. See the code below from SweetCanvas.js:
context.strokeStyle = Color.whiteColor().toRGBString();
If you would like to make use of SweetCanvasRenderer but still have a different color for each of the lines, you will have to make small changes to the SweetCanvas.js file. Or you can consider using the BasicCanvasRenderer instead.

VirtualBox 3.0 freezes under network activity

*** Update on 07/11/2009 ***
The issue below has been marked as fixed. Refer to the release notes of 3.0.2 for a note on what has been fixed. I think now you can happily move back to VirtualBox 3.0 :-)

After installing VirtualBox 3.0, I ran my Ubuntu VM under it. When I ran the update manager, my VM kept freezing. I realized that it is not only the update manager: even when Firefox tries to download a huge file, VirtualBox freezes. Looks like this is a known issue and there is a bug report for it. As of now, there is no fix available and no ETA for when a fix will be available.

So I have happily reverted back to VirtualBox 2.2.4 version. I don't see that issue anymore.

Wednesday, July 08, 2009

Doing "View source" for JavaScript generated page

By JavaScript generated page I mean a page that is mostly constructed using JavaScript's document.write(...) statements. This is mostly the case when you make use of UI frameworks like YUI.

Recently one of my friends approached me to help him with trouble shooting a page which was developed using YUI. We spent some time trying to figure out how to view the source of the page that we were viewing. When we did a right click and "View source" in Firefox, all we saw was a bunch of JavaScript sources being included. Nothing more.

So here is how you can easily view the source. Install "Execute JS" plug in. Then load the page you would like to view source. Then you click on "Tools -> Open Execute JS". Then check the "Content Window" and choose the title of the window that you would like to view the source for. Now in the "JS-Code to execute", just type the following:
Then click on "Execute". You will find the HTML source of the page displayed in the bottom pane. Execute JS is powerful enough that you can type any arbitrary expression and execute it.

Hope that helps.

VT-x is not available error in VirtualBox 3.0

After installing the VirtualBox 3.0, when I started my Ubuntu VM, I got the error shown in the image on the right side.

The reason for this error is explained in this discussion thread. To recover from this error, follow the steps below:
  • Click on the OK button and come to the VirtualBox manager.
  • Click on the VM that threw this error and click on the System.
  • Click on the Processor tab and reduce the number of CPUs to 1.
  • Click on OK and save your changes.
  • Now start your VM by selecting it and clicking on Start button or just by double clicking on the VM.
Though the VirtualBox can support multiprocessor guest operating systems, to enable it you must have VT-x support from the underlying processor.

Monday, July 06, 2009

bashreduce - a MapReduce system using command line tools

I came across this interesting reading in Linux Magazine about bashreduce. Sounds interesting. You will also find it useful to read about Richard Crowley's extensions to bashreduce.

KSplice - That's what I had been looking for

The time has come when critical kernel patches can be applied without rebooting the system. The tool that does this magic is called ksplice and it is shipped as a part of the Ubuntu system (Jaunty). It would be a valuable part of any system that cannot afford downtime. To learn more about ksplice, please read this Linux Magazine article.

Saturday, July 04, 2009

Why does Python have both lists and tuples?

There is a FAQ entry that explains why Python has both list and tuple data types and what the key difference between these two types is.
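
I am not reproducing the FAQ entry here. The short sketch below is only my own illustration of the best-known difference: lists are mutable, while tuples are immutable and can therefore be used as dictionary keys. The variable names are made up for the example.

```python
# A list is mutable: it can grow, shrink and be changed in place.
temperatures = [21.5, 22.0]
temperatures.append(23.1)

# A tuple is immutable: once built, its items cannot be reassigned.
point = (3, 4)
try:
    point[0] = 10
except TypeError as err:
    print("tuples cannot be modified:", err)

# A tuple of hashable items is itself hashable, so it can be a dict key.
# A list cannot, because its contents could change after insertion.
distances = {(0, 0): 0.0, point: 5.0}
try:
    distances[[3, 4]] = 5.0
except TypeError as err:
    print("lists cannot be dict keys:", err)
```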

Saturday, June 27, 2009

Adding META tag to your blogger page layout

When I wanted to submit my blog for Google AdSense, I first had to confirm the ownership of my blog. It can be proved in two ways: either by adding a META tag in the HEAD section or by adding an HTML file to the site. Since my blog is hosted on Blogger, the easiest way for me to confirm my ownership is to add a META tag to my blog layout.

It is very easy to add this tag. Follow these steps.
  1. Login to your blogger account
  2. Click on Layout and then Edit HTML
  3. Search for the following phrase: all-head-content. This is the line that tells the blogger engine as to what needs to be included in the HEAD tag.
  4. Insert your meta tag on a new line just before this line (before the whole line, not just before the phrase all-head-content). Most likely you will be including a META tag that looks like:
    <meta name='verify-v1' content="Some base64 encoded string"/>
  5. Save your changes and view your blog and confirm that your new META tag appears on the header.
Remember that though you specify name and then content, when Blogger serves the page the attributes will appear as content and then name. Don't get confused. Another common issue: make sure this META tag appears as the first tag in the HEAD section, as Google doesn't recognize it if it appears later in the HEAD section.

Tuesday, June 16, 2009

The mysterious ORA-03111 error

Recently one of the applications that I developed started throwing exceptions that had the following message:
SQL state [72000]; error code [1013]; ORA-03111: break received on communication channel
When I googled around, I couldn't find anything useful. Sadly, most of the sites just showed the documentation for that error, without any explanation from anyone who had experienced the issue. So here you go, with the best explanation that I could come up with.

My application sets two things on the connection that is throwing this exception:
  • It sets the fetchSize to be 2500 rows
  • It sets the query timeout to be 10 seconds
The database server and the application are separated over a long latency network (actually there is a NetEm box that emulates the long latency between these two boxes) which has a latency characteristic of 50+/-5 milliseconds. This is the whole setup.

It is important to understand how the timeout is handled by the Oracle client (in my case JDBC client). Once the query is successfully submitted, the client starts a clock for the timeout. Once the timeout is reached, the client sends an URG message to the Oracle server. The moment Oracle server receives this message, it knows that the client wants to cancel the operation that it was carrying on, no matter what stage the operation is in.

So take a couple of cases. Assume the operation is a SELECT query that will result in 10000 rows. If the Oracle server hasn't even started fetching the results, the client's request will most likely be answered immediately with the error code ORA-01013, which has a description like:
SQL state [72000]; error code [1013]; ORA-01013: user requested cancel of current operation
But imagine the server has fetched the rows and is in the process of pumping the result set back to the client. If the client asks the Oracle server to cancel the operation while there is still pending data in the socket to be delivered, the server just adds the ORA-03111 packet at the end of the pending packets, letting the client know that the operation was cancelled while there was pending data to be delivered.

Look at the tcpdump output below:
23:13:08.613007 IP jdbc_client.48681 > orcle_server.1521: P 2543:3174(631) ack 2342 win 11908
23:13:18.635068 IP jdbc_client.48681 > orcle_server.1521: P 3174:3175(1) ack 265693 win 65535 urg 1
23:13:20.472561 IP orcle_server.1521 > jdbc_client.48681: P 398520:398615(95) ack 3186 win 65535
0x0000: 0015 c5ec 12a8 0021 1c1d c0c3 0800 4500 .......!......E.
0x0010: 0087 5023 0000 3406 bad6 c0a8 fd9a c0a8 ..P#..4.........
0x0020: fc8b 05f1 be29 a792 459f a091 06cc 5018 .....)..E.....P.
0x0030: ffff 67c2 0000 005f 0000 0600 0000 0000 ..g...._........
0x0040: 0402 04e3 0203 f500 0001 0300 0300 0000 ................
0x0050: 0000 0000 0000 0000 0000 0001 0100 0000 ................
0x0060: 0033 4f52 412d 3033 3131 313a 2062 7265 .3ORA-03111:.bre
0x0070: 616b 2072 6563 6569 7665 6420 6f6e 2063 ak.received.on.c
0x0080: 6f6d 6d75 6e69 6361 7469 6f6e 2063 6861 ommunication.cha
0x0090: 6e6e 656c 0a nnel.

Pay special attention to the times when the SELECT query was sent (23:13:08), when the cancel request was sent as an URG packet (23:13:18), and when Oracle sent the last TNS packet, which carries the error code ORA-03111 (23:13:20).

The cancel request as an URG packet was sent after 10 seconds because as I mentioned earlier my query timeout is 10 seconds.

So now the million dollar question: What should I do if I am facing this issue in my application?

Follow these simple steps:
  • First make sure that your query can be completed within the timeout that you have specified. If you consistently face this exception, try increasing your timeout.
  • That might help to get rid of the exception, but not the root cause. The root cause usually is a database that is not optimized for the query that you are executing or a bad network.
  • To find out if it's the database that is the issue, try executing the same query from a host closer to the database on the network. Or try executing the same query against the database from a different network. If you are convinced the database is the issue, try to tune it.
  • To find out if it is the network that is having the issue, capture a tcpdump and check for out-of-order or dropped packets. If you find any, try to fix the network.
In my case, it turned out to be the bad configuration in the NetEm that was causing too many packets to be delivered out of order and too many duplicated packets. Remember I was introducing a variance of 10 ms (i.e. my packets could be delayed anywhere from 45 ms to 55 ms, as per my configuration). In real cases, at least in a well maintained production network, the variance will not be more than 1 ms.
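
To see why a delay variance of that size reorders packets, here is a rough Python simulation. It is only a sketch under my own assumptions and has nothing to do with how NetEm is actually implemented: packets are sent 1 ms apart, each one gets an independent delay drawn uniformly from base ± jitter, and the function name count_reordered is hypothetical.

```python
import random

def count_reordered(num_packets, spacing_ms, base_delay_ms, jitter_ms, seed=0):
    """Count packets that arrive before a packet that was sent earlier."""
    rng = random.Random(seed)          # fixed seed keeps the run repeatable
    latest_arrival = float("-inf")
    reordered = 0
    for i in range(num_packets):
        send_time = i * spacing_ms
        # Assumption: delay is uniform in [base - jitter, base + jitter].
        delay = base_delay_ms + rng.uniform(-jitter_ms, jitter_ms)
        arrival = send_time + delay
        if arrival < latest_arrival:
            reordered += 1             # this packet overtook an earlier one
        else:
            latest_arrival = arrival
    return reordered

print("50 +/- 5 ms:", count_reordered(1000, 1.0, 50.0, 5.0))
print("50 +/- 0 ms:", count_reordered(1000, 1.0, 50.0, 0.0))
```

When the jitter is larger than the gap between consecutive packets, a later packet can easily draw a shorter delay and overtake an earlier one. With no jitter the arrival order always matches the send order.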

Since I am not an expert in Oracle, I would be happy if anyone reading this blog entry has something to add on top of what I have told here. And I sincerely believe that this posting would help whoever is facing this issue.

Saturday, June 13, 2009

Availability calculation

I wanted to understand how the availability is calculated for systems. I came across an excellent article and found it very useful. You can read it from here.
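
I am not reproducing the article here. As a rough illustration, the sketch below uses one common textbook formulation, availability = MTBF / (MTBF + MTTR), and converts an availability figure into the downtime it allows per year. The function names and the sample numbers are my own assumptions, not taken from the article.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years for simplicity

def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, given mean time between
    failures (MTBF) and mean time to repair (MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_hours_per_year(avail):
    """Hours of downtime per year implied by an availability figure."""
    return (1.0 - avail) * HOURS_PER_YEAR

# Hypothetical system: fails on average every 999 hours, takes 1 hour to fix.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"availability: {a:.3%}")
print(f"downtime per year: {downtime_hours_per_year(a):.2f} hours")
```

For this made-up system the availability is 99.9% ("three nines"), which allows about 8.76 hours of downtime per year.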

Tuesday, June 09, 2009

Yet another top 10 list for Firefox

But this one is useful. The title of the blog entry is "Top 10 Firefox plugins for web developers & designers".

Effective use of Timeouts

It is very important to pay special attention to the timeouts and set them with proper values for any blocking operation. Any good library API must provide a way to set the timeout for any blocking operation.

Recently we identified one of the issues in our production network because one of the programs that we deployed in production had the right timeout value set. I just thought I must reiterate the fact that it's critical to set the timeout values to an optimum level so that we can fail fast and catch issues.
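
As a small illustration of failing fast, here is a Python sketch that waits on a blocking queue with a timeout instead of waiting forever. The slow producer merely stands in for a slow dependency; the function names and the delay values are made up for the example.

```python
import queue
import threading
import time

def slow_producer(q, delay_seconds):
    """Simulate a slow dependency that answers only after a delay."""
    time.sleep(delay_seconds)
    q.put("response")

def get_with_timeout(q, timeout_seconds):
    """Wait for a result, but give up after timeout_seconds."""
    try:
        return q.get(timeout=timeout_seconds)
    except queue.Empty:
        return None  # fail fast; the caller can log, retry or alert

q = queue.Queue()
threading.Thread(target=slow_producer, args=(q, 0.5), daemon=True).start()

# The producer needs 0.5 s but we are only willing to wait 0.1 s.
result = get_with_timeout(q, timeout_seconds=0.1)
print("timed out" if result is None else result)
```

Without the timeout argument, q.get() would block until the producer finally answered, and a hung dependency would hang the caller as well.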

Tuesday, May 12, 2009

Monday, May 11, 2009

Difference between function, macro and special form in Lisp

A very good description of the differences between function, macro and special forms in Lisp. Read all the way through the end of the email thread.

Saturday, May 09, 2009

A list of WYSIWYG editors for content management

There are times when you would like to have a text area in your web site that resembles a word processor's page, with a lot of controls to format your text. If you have worked with any content management system (CMS) or a wiki, you have most likely come across one. I was looking for a WYSIWYG editor and came across a few of them. I thought I should share the list with everyone so that you don't have to waste your precious time. The list below is not in any particular order.

  1. FCKEditor. Looked like the editor that had the most features. The site provides a demo and very good documentation as well.
  2. TinyMCE. Wherever people talk about WYSIWYG editors, they invariably mention both FCKEditor and TinyMCE. It provides a demo, and by looking at the source of the demo you can see how easy it is to integrate this editor.
  3. Xinha. They have an excellent demo page that allows you to customize the editor on the fly and see the result.
  4. Kupu. Kupu is from the Open Source Content Management team. I could not see a demo anywhere (the site provides two links to Plone and Silva, unfortunately both are broken). I could only see a few screenshots.
Hope that helps someone on the lookout for the right editor.

Friday, May 08, 2009

Django tutorial

I managed to complete the three-part Django tutorial today. I haven't worked with many web frameworks before. I have very good experience in CGI and some flaky experience in JSP and PHP. I liked Django because it couples the model and the database, so that you hardly have to worry about writing database queries. I don't know how much more time it would have taken if I had written the poll application myself in either JSP or CGI, especially setting up the database and the tables.

Now that I have first-hand experience in developing an application in Django, I have yet to see how it performs under reasonable and heavy loads. I have also yet to discover how Django is deployed in production. As per my understanding, the standard Python interpreter (CPython) is not well suited to CPU-bound multi-threaded applications, because its global interpreter lock lets only one thread execute Python code at a time. The best way to scale Python is therefore not to create a multi-threaded application, but to run more instances of Python with a load-balancing server in front of them.

Thursday, May 07, 2009

Mounting folder shares in xVM

In xVM, you can access folder shares from the host system in your guest Linux system. The prerequisite is to install all the guest additions. The following is the process that I followed to install guest additions and mount the shares in my Ubuntu desktop (Linux ubuntu-desktop 2.6.24-24-generic).

Installing guest additions is very easy. Once you start your guest OS, you click on the "Devices -> Install Guest Additions" in the xVM. Then it will mount a CD ROM for you. Once the CD ROM is mounted, just let it auto run or you can run the appropriate shell script from the mounted CDROM. Usually it is mounted under /media/cdrom. So doing an ls under that directory should tell you which shell script you need to run.

To create a share, go to "Devices -> Shared Folders". Click on the add share icon (the one that has "+" in it). Select the folder you would like to share with the guest and then give it a name. It is preferable to give a short name for the share. Let us say you give the name shared-folder. Depending on your need, you can mark it read-only. If you would like to make the share permanent across multiple invocations of the xVM, you can click on the "Make Permanent" too. Once this is done, click on "OK" and close this box.

From your guest, give the following command (I am mounting under /tmp/shared-folder) as root:
mkdir /tmp/shared-folder
mount -t vboxsf shared-folder /tmp/shared-folder
That's it. You can access all the files in the shared-folder share from the /tmp/shared-folder directory. BTW, the share name and the mounted directory name do not need to be the same.

For details, please refer to the user manual that comes with the xVM.

A useful regular expression tutorial

This is one of the most useful regular expression tutorials I have seen in recent times. The tutorial is not only clear and concise, it also covers exactly what is needed to be fully productive.

BTW, the tutorial was written by a high school student!

xVM's vboxmanage.exe command

You can view and edit the configuration of your guest OSs using the vboxmanage command. To view the list of guest OSs you have, give the following command:
C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe list vms
VirtualBox Command Line Management Interface Version 2.2.2
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

"MySolaris" {dc6dc85f-5583-4a1b-bf3e-969941a2cd91}
"Ubuntu" {eb973bbf-d86e-4579-85eb-6ea2cd12bf95}
"Debian" {3e784597-89d8-4f17-90cb-63e866c651a3}
"openSUSE" {6862884e-60e1-4c65-8aab-b57ec38a3922}
"Debian-Lenny" {4b561432-abac-4a27-a501-b42af956b96b}

To view the specific guest, you can use the UUID of the guest. For e.g.
C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe showvminfo eb973bbf-d86e-4579-85eb-6ea2cd12bf95
VirtualBox Command Line Management Interface Version 2.2.2
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

Name: Ubuntu
Guest OS: Ubuntu
UUID: eb973bbf-d86e-4579-85eb-6ea2cd12bf95
Config file: C:\Documents and Settings\roy\.VirtualBox\Machines\Ubuntu\Ubuntu.xml
Memory size: 512MB
VRAM size: 12MB
Boot menu mode: message and menu
ACPI: on

Remember a couple of things:
1) If you would like to view the guest OS by giving its alias, it is case sensitive. For e.g.
C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe showvminfo ubuntu
VirtualBox Command Line Management Interface Version 2.2.2
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

ERROR: Could not find a registered machine named 'ubuntu'
Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component VirtualBox, interface IVirtualBox, callee IUnknown
Context: "FindMachine (Bstr(VMNameOrUuid), machine.asOutParam())" at line 1921 of file VBoxManageInfo.cpp
But the same command worked fine when I gave "Ubuntu" (without quotes) as argument.

2) If you would like to specify the UUID, remember not to include the curly braces ("{}").

To know all the available options, you can just give the vboxmanage command. That will list you all the available operations.
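
If you want to script this, the following Python sketch pulls the names and UUIDs out of the "list vms" output and drops the curly braces along the way. It only parses text in the format shown above; the function name parse_list_vms and the regular expression are my own, and the sample is a shortened copy of the output from this post.

```python
import re

# Matches lines such as: "Ubuntu" {eb973bbf-d86e-4579-85eb-6ea2cd12bf95}
VM_LINE = re.compile(r'^"(?P<name>.+)"\s+\{(?P<uuid>[0-9a-fA-F-]{36})\}\s*$')

def parse_list_vms(output):
    """Map VM name -> UUID (without braces) from `VBoxManage list vms` text."""
    vms = {}
    for line in output.splitlines():
        match = VM_LINE.match(line.strip())
        if match:
            vms[match.group("name")] = match.group("uuid")
    return vms

sample = '''VirtualBox Command Line Management Interface Version 2.2.2
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

"MySolaris" {dc6dc85f-5583-4a1b-bf3e-969941a2cd91}
"Ubuntu" {eb973bbf-d86e-4579-85eb-6ea2cd12bf95}
'''

vms = parse_list_vms(sample)
print(vms["Ubuntu"])
```

The dictionary keys keep the exact case of the VM names, which matters because showvminfo treats names as case sensitive.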

Wednesday, May 06, 2009

Installing matplotlib - the hard way

I recently installed matplotlib from source. It was quite an experience, so I thought I would share it so that others don't have to waste time searching for how to do that. So here it is!

What is matplotlib?
Matplotlib is a library to plot figures from your Python program. It has many more features than just plotting. You can read more about it on the library's home page.

If you are planning to install matplotlib on Linux from source, building everything yourself, this guide is for you. Please read on.

You will have to install the dependencies first before you can install matplotlib. Matplotlib depends on numpy, zlib, libpng and FreeType libraries. You can get the full dependency list (including the optional dependent libraries) from here. Let us see how to install each one of these components.

Phase 1: Installing numpy
Numpy requires the same Fortran compiler that was used to build blas. There are two flavors of Fortran compiler possible: g77 or gfortran. Unfortunately, these two are not ABI compatible, so make sure you identify which one you need. It is easy to identify the Fortran compiler that was used by running ldd on the blas library:

roy@roy-debian:~$ ldd /usr/lib/
 => (0xb7f51000)
 => /usr/lib/ (0xb7e09000)
 => /lib/i686/cmov/ (0xb7de3000)
 => /lib/ (0xb7dd5000)
 => /lib/i686/cmov/ (0xb7c7a000)
/lib/ (0xb7f52000)

Hence in my case I know that it is gfortran that I should be making use of. Install gfortran, by following the steps below:
aptitude search gfortran
sudo apt-get install gfortran-multilib

Then download the Numpy from here. Once you have downloaded, untar the tar file and follow the steps below:
roy@roy-debian:~/Desktop$ tar xvfz numpy-1.3.0.tar.gz
roy@roy-debian:~/Desktop$ cd numpy-1.3.0/
roy@roy-debian:~/Desktop/numpy-1.3.0$ python setup.py build --fcompiler=gnu95
roy@roy-debian:~/Desktop/numpy-1.3.0$ sudo python setup.py install

Once you are done compiling and installing the numpy module, come out of the numpy-1.3.0 directory and check that everything is okay:
roy@roy-debian:~/Desktop/numpy-1.3.0$ cd ..
roy@roy-debian:~/Desktop$ python
Python 2.5.2 (r252:60911, Jan 4 2009, 17:40:26)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version

Phase 2: Installing zlib and libpng
You can download zlib from here. It is required for libpng. Once you have downloaded zlib, do the following:
roy@roy-debian:~/Desktop$ tar xvfz zlib-1.2.3.tar.gz
roy@roy-debian:~/Desktop$ cd zlib-1.2.3/
roy@roy-debian:~/Desktop/zlib-1.2.3$ ./configure
roy@roy-debian:~/Desktop/zlib-1.2.3$ make test
roy@roy-debian:~/Desktop/zlib-1.2.3$ sudo make install

Once this is done, you are ready to install libpng. You can download libpng from here. Once you have downloaded it, follow similar instructions:
roy@roy-debian:~/Desktop$ tar xvfz libpng-1.2.35.tar.gz
roy@roy-debian:~/Desktop$ cd libpng-1.2.35/
roy@roy-debian:~/Desktop/libpng-1.2.35$ ./configure
roy@roy-debian:~/Desktop/libpng-1.2.35$ make check
roy@roy-debian:~/Desktop/libpng-1.2.35$ sudo make install

Phase 3: Installing FreeType library
You can download FreeType library from here. Then follow the instructions below:
roy@roy-debian:~/Desktop$ tar xvfz freetype-2.3.9.tar.gz
roy@roy-debian:~/Desktop$ cd freetype-2.3.9/
roy@roy-debian:~/Desktop/freetype-2.3.9$ ./configure
roy@roy-debian:~/Desktop/freetype-2.3.9$ make
roy@roy-debian:~/Desktop/freetype-2.3.9$ sudo make install

Final phase: Installing matplotlib
Oooh ... Now we come to the final phase of our installation. Yes, we are actually going to install matplotlib.

You can download matplotlib from here. Once you have downloaded the tar file, follow the instructions below:
roy@roy-debian:~/Desktop$ tar xvfz matplotlib-
roy@roy-debian:~/Desktop$ cd matplotlib-
roy@roy-debian:~/Desktop/matplotlib-$ python setup.py build
roy@roy-debian:~/Desktop/matplotlib-$ sudo python setup.py install

Testing the installation
Well, that was quite a long process. Now let us check if everything went well.

roy@roy-debian:~/Desktop$ python
Python 2.5.2 (r252:60911, Jan 4 2009, 17:40:26)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib
>>> matplotlib.use('Agg')
>>> import matplotlib.pyplot as plt
>>> plt.plot(range(10))
>>> plt.savefig('myfig')
>>> exit()

The 'Agg' backend saves the file in PNG format. (Remember, we installed libpng not long ago!) To know more about backends, please refer here. Once you exit the Python shell, you can view myfig.png with the following command:
gimp myfig.png

If for some reason you don't have gimp installed, you can use firefox or iceweasel to view the PNG file.

Friday, May 01, 2009

Paper on garbage-first (G1) garbage collector

This is the paper on garbage-first (G1) garbage collector. It is an interesting read.

A better way of printing heap usage in Java

If you rely on getting the heap usage by methods provided in Runtime, then consider making use of the MemoryPoolMXBeans. The code to print the memory usage is extremely simple:
import java.lang.management.*;
import java.util.List;
List<MemoryPoolMXBean> mpool = ManagementFactory.getMemoryPoolMXBeans();
for (MemoryPoolMXBean b : mpool) {
    System.out.println(b.getName() + ": " + b.getUsage());
}
You will see something like this when you run this:
Code Cache: init = 163840(160K) used = 468672(457K) committed = 491520(480K) max = 33554432(32768K)
Eden Space: init = 917504(896K) used = 202792(198K) committed = 917504(896K) max = 4194304(4096K)
Survivor Space: init = 65536(64K) used = 0(0K) committed = 65536(64K) max = 458752(448K)
Tenured Gen: init = 4194304(4096K) used = 0(0K) committed = 4194304(4096K) max = 61997056(60544K)
Perm Gen: init = 12582912(12288K) used = 108360(105K) committed = 12582912(12288K) max = 67108864(65536K)
Perm Gen [shared-ro]: init = 8388608(8192K) used = 6162160(6017K) committed = 8388608(8192K) max = 8388608(8192K)
Perm Gen [shared-rw]: init = 12582912(12288K) used = 7282024(7111K) committed = 12582912(12288K) max = 12582912(12288K)
This is a lot more informative than the information provided by the Runtime. This works only in versions 1.5 or later.
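
If you want a single number to compare against what Runtime reports, the per-pool figures can be added up. Below is a minimal sketch that sums the used bytes of all heap (or non-heap) pools. The class name HeapPools and the method usedBytes are my own illustrative names, and the two figures will not match exactly because they are sampled at slightly different moments.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper class; the name is not from any library.
class HeapPools {

    // Sum the "used" bytes of every memory pool of the given type (HEAP or NON_HEAP).
    static long usedBytes(MemoryType type) {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            // getUsage() can return null when a pool is no longer valid.
            if (pool.getType() == type && usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Runtime (total - free): " + (rt.totalMemory() - rt.freeMemory()));
        System.out.println("Heap pools used:        " + usedBytes(MemoryType.HEAP));
        System.out.println("Non-heap pools used:    " + usedBytes(MemoryType.NON_HEAP));
    }
}
```

The non-heap total covers pools such as the code cache, which Runtime does not report at all.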

Thursday, April 30, 2009

A good reference on processor cache

I found this good reference on processor cache while I was looking for some material to refresh my cache knowledge. Hope this is useful for you too.

Friday, April 17, 2009

How to measure time to execute a query when using DBCP+JdbcTemplate

Keep the following things in mind if you are planning to measure the time taken to execute a query when you make use of Apache DBCP + Spring JdbcTemplate.

You will never be able to find out the time taken to execute a query in the database by measuring the time it takes to complete any variant of JdbcTemplate.query...() or JdbcTemplate.execute...(). The elapsed time of any of these functions includes all of the following:
  1. Time taken to borrow a connection from the pool
  2. Time taken to create a statement from the connection and bind the parameters if needed
  3. Time taken to send the query to the database
  4. Time taken to execute the query in the database (this is the time you are specifically interested in)
  5. Time taken to receive the result from the database (depending on the fetch size you have set, this will vary). For UPDATEs, only the update count is sent back from the database.
  6. Time taken in the row mapping (if you are making use of RowMapper) or time taken to consume all the rows from ResultSet (if you are making use of ResultSetExtractor)
  7. Time taken to return the connection to the pool
If you are sending the request over a high-latency link, the time you measure is bound to be off by at least a few milliseconds due to steps 3 and 5. For example, going from the west coast to the east coast of the US might take 50 ms or more.

If you are making use of a connection pool in which all the connections are borrowed all the time, then the time taken in steps 1 and 7 is significant. For example, during my experiments I have seen some threads wait as long as 100 ms when 23 threads were sharing a pool of 10 connections.

So the most reliable way of measuring the time spent in the database will involve the following things:
  1. Make sure you understand the average round-trip time between the database and your application. You will have to use this one for your analysis. You can use ping to measure this. If ICMP is not allowed in your network, make use of netcat.
  2. Make use of StatementCallback since it will help you in measuring time taken from steps 2 to 6 above. In other words the measurement will not be skewed by the DBCP borrow and return.
  3. Make use of PreparedStatementCreator variants of query() so that you will know the time when the statement has been created.
  4. If you are making use of RowMapper, the approximate time the query completed is when you start processing the first row. If you are making use of ResultSetExtractor, the time when extractData is called is the approximate query completion time. If you are making use of RowCallbackHandler, make use of ResultSet.isFirst() to decide the approximate query completion time.
  5. There is no way of excluding the time taken in DBCP when making use of batchUpdate().
  6. There is no way of excluding the time taken in returning a connection back to the pool when making use of update() functions.
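
To make the guidelines above concrete, here is a minimal sketch of a timer that records one timestamp per named phase. QueryPhaseTimer is a hypothetical helper of my own, not part of Spring or DBCP, and the Thread.sleep() calls in main merely stand in for the pool, the network and the database. In a real application you would call mark() from the Spring callbacks mentioned above (PreparedStatementCreator.createPreparedStatement(), RowMapper.mapRow() or ResultSetExtractor.extractData()).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Spring or DBCP): one timestamp per named phase.
// Not thread-safe; create one instance per measured query.
class QueryPhaseTimer {
    private final Map<String, Long> marks = new LinkedHashMap<>();

    // Record the current time under a label. Only the first call per label counts,
    // so it is safe to call mark("firstRow") from every mapRow() invocation.
    void mark(String label) {
        marks.putIfAbsent(label, System.nanoTime());
    }

    // Elapsed nanoseconds between two recorded labels.
    long nanosBetween(String from, String to) {
        Long start = marks.get(from);
        Long end = marks.get(to);
        if (start == null || end == null) {
            throw new IllegalArgumentException("unknown phase: " + (start == null ? from : to));
        }
        return end - start;
    }

    long millisBetween(String from, String to) {
        return nanosBetween(from, to) / 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        QueryPhaseTimer timer = new QueryPhaseTimer();
        timer.mark("beforeQuery");       // just before calling JdbcTemplate
        Thread.sleep(5);                 // stand-in for borrowing a connection
        timer.mark("statementCreated");  // e.g. inside a PreparedStatementCreator
        Thread.sleep(20);                // stand-in for network + database time
        timer.mark("firstRow");          // e.g. first mapRow() or extractData() call
        Thread.sleep(3);                 // stand-in for consuming the rows
        timer.mark("afterQuery");        // after JdbcTemplate returns

        System.out.println("borrow + create statement: "
                + timer.millisBetween("beforeQuery", "statementCreated") + " ms");
        System.out.println("network + database:        "
                + timer.millisBetween("statementCreated", "firstRow") + " ms");
        System.out.println("whole call:                "
                + timer.millisBetween("beforeQuery", "afterQuery") + " ms");
    }
}
```

Subtract your measured round-trip time from the "network + database" figure to approximate the time spent inside the database itself.
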
Hope these guidelines will be useful to you. I followed them when I had to measure the time difference between the database and the host where my application was running. It is very tricky in the presence of the DBCP since borrowing and returning always skews the numbers.

Wednesday, April 01, 2009

Is my DBCP configuration bad or my database slow?

Let us say you are making use of DBCP in your application. You are running a load test on your application and you find that the response time is terrible. You suspect that it is the database that is messing up all the performance. Well, as an application developer that would be my first response! But how do I prove that it is the database that is causing the issues, and not my application? The first thing you should check is the number of load testing client threads versus the number of connections in your DBCP pool. If you have more client threads than connections in the pool, there is a good chance that your application is the root cause of the bad response time.

The easiest way to confirm this is by using the maxWait parameter of DBCP. Just set maxWait to the waiting time you expect. For example, in my case I never expect the wait time to be more than 30 milliseconds. Once you set this, watch for errors in your application logs. If threads are queuing up for connections, you will find a lot of the following exceptions:
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object

This simple yet powerful experiment will help you identify where the long-running transactions are spending most of their time: waiting in the queue for a connection, or executing the query in the database.
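
You can get a feel for the experiment without a database at all. The sketch below is only a simulation under my own assumptions: a java.util.concurrent.Semaphore stands in for the connection pool (it is NOT DBCP), the timeout passed to tryAcquire() plays the role of maxWait, and Thread.sleep() plays the role of the query. MaxWaitDemo and countTimeouts are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Simulation only: a Semaphore with one permit per "connection" stands in for the pool.
class MaxWaitDemo {

    // Returns how many client threads gave up waiting for a "connection".
    static int countTimeouts(int connections, int clients, long maxWaitMs, long queryMs) {
        Semaphore pool = new Semaphore(connections);
        AtomicInteger timeouts = new AtomicInteger();
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] threads = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            threads[i] = new Thread(() -> {
                try {
                    startGate.await();                      // all clients start together
                    if (!pool.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS)) {
                        timeouts.incrementAndGet();         // waited longer than "maxWait"
                        return;
                    }
                    try {
                        Thread.sleep(queryMs);              // pretend to run the query
                    } finally {
                        pool.release();                     // return the "connection"
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        startGate.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return timeouts.get();
    }

    public static void main(String[] args) {
        // 10 connections, 23 clients, 30 ms "maxWait", 200 ms per simulated query
        System.out.println("clients that timed out: " + countTimeouts(10, 23, 30, 200));
    }
}
```

When the clients outnumber the connections and each query takes longer than the wait limit, the timeouts show up even though the simulated "database" is perfectly healthy, which is exactly the situation the maxWait experiment is meant to expose.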

On a side note, I would like to point out that the queuing policy of DBCP is unfair, meaning there is no guarantee that the first request will get a connection from the pool first. This may be because GenericObjectPool relies upon Object.notifyAll() to notify the waiting threads.

Thursday, March 26, 2009

Saving a file being viewed in less

Sometimes I have a long sequence of command-line pipes that ends in the less command. For example:
cvs log main.c | egrep "^revision " | less
So what is the best way to save the contents that are being viewed in less? You cannot view those contents by typing the letter "v". If you do so you will get an error message saying "Cannot edit standard input (press RETURN)".

The solution is simple. First type "g" to go to the beginning of the file. Then type "|". You will be prompted with "mark:"; enter "$" (which denotes the end of the file). Then you can enter a command to which the entire contents of less will be piped. For example, I give the following:
cat > /tmp/savefile.txt
That's it! You have saved the contents to the file /tmp/savefile.txt. Now you have a problem. Since you have moved to the beginning of the contents, you might want to go back to the place where you were before saving the contents. Just type "''", i.e. a single quote twice. You should be back where you were before you started saving the file.

Hope this little trick saves a lot of time for you! If yes, please drop me a word :-)

Monday, March 23, 2009

So you think you understand misfireThreshold?

One of the most critical and least documented features of Quartz is the concept of misfiring and how misfires are handled. If a trigger was supposed to carry out a job at a certain time, and for some reason the job was not carried out at that time, it is a misfired trigger. But not every late trigger is treated as a misfired trigger. For example, if a trigger is overdue by 2 hours from its actual firing time, does it make sense to treat it as a misfired trigger? To make that decision, the Quartz scheduler relies upon a configurable parameter called misfireThreshold, which is specified in milliseconds. Whenever a trigger is overdue by misfireThreshold or less, it will be treated as a misfired trigger and fired according to the misfire policy specified in the job detail.

For example, in one situation you might want a misfired trigger to be fired immediately, while in another you might decide it is okay to reschedule the same trigger to the next firing time. For SimpleTrigger, there are five policies available and all of them are documented as a part of the API here. They are pretty much self-explanatory.

It is critical to understand that the default value of misfireThreshold is 30 seconds. As an example, consider a trigger that fires once every second to carry out a job that cannot be executed concurrently. The job usually takes less than a second to complete, but on one occasion it takes around 20 seconds. As soon as that run completes, you will see the same job fired 19 times in immediate succession! This is because all the 19 times at which the trigger should have fired qualify as misfired triggers under the default value of misfireThreshold.
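
The count of 19 is simple arithmetic, and it is worth doing for your own triggers before picking a threshold. The sketch below does not call Quartz at all; MissedFirings is a hypothetical name, and it assumes the long run starts exactly on a scheduled fire time.

```java
// Hypothetical helper; it performs arithmetic only and does not touch Quartz.
class MissedFirings {

    // Number of scheduled fire times that fall strictly inside one long job run,
    // assuming the run starts exactly on a fire time.
    static long during(long jobDurationMs, long repeatIntervalMs) {
        if (repeatIntervalMs <= 0) {
            throw new IllegalArgumentException("repeat interval must be positive");
        }
        if (jobDurationMs <= 0) {
            return 0;
        }
        // Ceiling division gives the number of interval slots the run touches;
        // the first slot is the firing that started the run, so subtract it.
        long slots = (jobDurationMs + repeatIntervalMs - 1) / repeatIntervalMs;
        return slots - 1;
    }

    public static void main(String[] args) {
        // A 20 second run of a job that is triggered every second.
        System.out.println(during(20000, 1000)); // prints 19
    }
}
```

What happens to those backlogged firings afterwards is decided by misfireThreshold together with the misfire policy.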

One of the bad things about the above scenario is that a large number of triggers need to be carried out within a short interval of time. This might not be desirable, and it also introduces starvation at times. I have observed the misfired trigger being executed repeatedly, blocking the execution of other triggers!

My suggestion is to set misfireThreshold to half of the repeat interval. Of course, it depends on your application too, but in general this might be good enough. Remember that misfireThreshold is common across all the triggers, whereas repeatInterval is set per trigger. So if you have various triggers with differing repeat intervals, decide what the optimum value is for your case.

One thing that using Open Source Software has repeatedly taught me is never to assume that the default configuration parameters fit most cases. Some of my situations have required tuning parameters that are often the least documented or least understood!

Wednesday, March 11, 2009

More comments on JDBC batch update return codes

This is a quick follow-up to my earlier post on JDBC batch update return codes. There are two constants available in java.sql.Statement that you can conveniently use to check the return codes:
SUCCESS_NO_INFO - The UPDATE or INSERT corresponding to these input values has succeeded, but the number of rows that it affected is unknown.
EXECUTE_FAILED - The UPDATE or INSERT corresponding to these input values has failed.
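
Here is a minimal sketch of how the two constants might be used to summarize the int array returned by a batch update. BatchResultSummary and summarize are hypothetical names of my own; only Statement.SUCCESS_NO_INFO and Statement.EXECUTE_FAILED come from the JDK.

```java
import java.sql.Statement;
import java.util.Arrays;

// Hypothetical helper that classifies the entries of a batch update result.
class BatchResultSummary {

    // Returns {total rows reported, entries with SUCCESS_NO_INFO, entries with EXECUTE_FAILED}.
    static int[] summarize(int[] updateCounts) {
        int rows = 0;
        int noInfo = 0;
        int failed = 0;
        for (int count : updateCounts) {
            if (count == Statement.SUCCESS_NO_INFO) {
                noInfo++;          // succeeded, row count unknown
            } else if (count == Statement.EXECUTE_FAILED) {
                failed++;          // this statement failed
            } else {
                rows += count;     // a real update count (zero or more)
            }
        }
        return new int[] { rows, noInfo, failed };
    }

    public static void main(String[] args) {
        int[] result = { 1, Statement.SUCCESS_NO_INFO, Statement.EXECUTE_FAILED, 0, 2 };
        System.out.println(Arrays.toString(summarize(result))); // prints [3, 1, 1]
    }
}
```

Comparing against the named constants instead of the raw numbers keeps the intent readable.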

Wednesday, March 04, 2009

Eclipse CVS error

Today, while I was trying to synchronize my code with the CVS repository, I got the following error:
Problems reported while synchronizing CVS Workspace. 0 of 1 resources were synchronized.
An error occurred synchronizing /abcd: I/O exception occurred: channel is broken
I/O exception occurred: channel is broken channel is broken
I scratched my head for some time and searched the Internet for an answer, but none seemed to be appropriate for my case. Then, when I tried to do the checkout from the command line using plink, I got a message from my CVS server stating that my password had expired and I needed to change it. So I changed my password and everything started working fine.

If you get the error message above, one possible and easy-to-check root cause is an expired password.

Monday, March 02, 2009

Understanding return codes of JDBC batchUpdate

Recently I had to make use of the JdbcTemplate.batchUpdate() facility in Spring. I was connecting to an Oracle database using the Oracle JDBC driver. As per the documentation, the batchUpdate() function is supposed to return an integer array. Each element in the array contains the number of rows affected by the respective INSERT/UPDATE/DELETE query in the batch. But during my testing I found that every element was always -2.

Initially I was thinking it was a bug in the driver code. Then when I was referring to the JDBC Programmers Guide, I figured the following:
For a prepared statement batch, it is not possible to know the number of rows affected in the database by each individual statement in the batch. Therefore, all array elements have a value of -2. According to the JDBC 2.0 specification, a value of -2 indicates that the operation was successful but the number of rows affected is unknown.

There are more examples and explanations of the return codes on the same page. Pay particular attention to the case when one of the statements throws an exception:
For example, if there were 20 operations in the batch, the first 13 succeeded, and the 14th generated an exception, then the update counts array will have 13 elements, containing actual update counts of the successful operations.
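
In code, the exception in question is java.sql.BatchUpdateException, and its getUpdateCounts() method returns the array described in the quote. The sketch below works out which statement in the batch failed. BatchFailureIndex is a hypothetical helper of my own; it assumes either a driver that stops at the first failure (as in the quote above) or one that keeps going and marks failures with Statement.EXECUTE_FAILED.

```java
import java.sql.BatchUpdateException;
import java.sql.Statement;
import java.util.Arrays;

// Hypothetical helper that locates the first failed statement in a batch.
class BatchFailureIndex {

    // Returns the zero-based index of the first failed statement, or -1 if unknown.
    static int firstFailedIndex(BatchUpdateException e) {
        int[] counts = e.getUpdateCounts();
        if (counts == null) {
            return -1;                       // the driver supplied no update counts
        }
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == Statement.EXECUTE_FAILED) {
                return i;                    // driver continued past the failure
            }
        }
        // Driver stopped at the first failure: one entry per successful statement,
        // so the failed statement is the one right after the last entry.
        return counts.length;
    }

    public static void main(String[] args) {
        int[] thirteenSuccesses = new int[13];
        Arrays.fill(thirteenSuccesses, 1);
        BatchUpdateException e = new BatchUpdateException(thirteenSuccesses);
        System.out.println(firstFailedIndex(e)); // prints 13, i.e. the 14th statement
    }
}
```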

Hope this helps if you are seeing the mysterious -2 as the return code!

Tuesday, February 24, 2009

Gotchas with DBCP

I had to make use of a DB connection pool in one of the applications that I was implementing. Since I am new to Java, everyone in the team suggested that I use Commons DBCP. Being a fan of reuse, I decided to make use of DBCP and started reading about the parameters that are exposed so that I could customize DBCP to fit my needs.

While reading through the available configuration parameters, I was thinking, "I don't need to check whether a connection is bad on every borrow or return. It is sufficient if I check only when the connection is sitting idle in the pool." So I set testWhileIdle to true and set the validationQuery to "SELECT 1 FROM DUAL". I thought that the configuration I had was optimal for the situation at hand.

But one thing I overlooked was that the default value of testOnBorrow is true. So the moment you set the validationQuery to any non-null value, every connection you borrow from the pool is going to be validated before it is returned to you. Effectively, if my application was executing 10 DB queries for each customer request, there were 10 more validation queries being executed underneath.

I realized this only when I ran some load tests. The performance was so poor that I couldn't believe it and wondered what was wrong with my code. Especially even under moderate stress, the system was giving believable results.

So the moral of the story is, make sure you set all the parameters explicitly. Especially make sure you set testOnBorrow, testOnReturn and testWhileIdle parameters to fit your needs. Below is the list of properties that you can set/get in Apache Commons DBCP (version 1.2.2):
numActive (read only)
numIdle (read only)

wrapperFor (read only)

Okay, that is just the list of properties. There is one more important thing that you should keep in mind regarding the eviction thread. The process of eviction is synchronized on the entire connection pool. This means that while the eviction process is going on, no other thread can borrow or return connections from or to the pool. That means a total freeze for any other thread that wants to borrow or return at that time. There is one parameter that you can use to tune the freeze time: numTestsPerEvictionRun. Its default value is 3. This means that the eviction thread will first acquire a lock on the connection pool, inspect three connections in the pool and then release the lock.

In case you have set testWhileIdle to true (and provided a validationQuery), then the validationQuery will be executed for each connection that is inspected and retained. This is very important to keep in mind, since it increases the amount of time the connection pool is locked.

Let us say you know that the number of idle connections in the pool is very small and that your database takes only 1 millisecond to execute "SELECT 1 FROM DUAL" (which is your validation query). In that case you might consider keeping the numTestsPerEvictionRun value high. But again, whether 3 milliseconds (for checking three connections) is high or low depends entirely on your application.

One middle ground between these two extremes (testOnBorrow and testWhileIdle) is to selectively validate connections once in every N borrows. For example, you should be able to ask DBCP to validate a connection once in every 10 borrows. This way you don't need the eviction thread running, and you will not have any long freezes. But this facility is NOT available in Apache DBCP now, and I am not aware of any other connection pool that provides it either.
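
Since DBCP has no such option, you would have to build it yourself in a wrapper around the pool. The sketch below shows only the counting part, under my own assumptions: EveryNthBorrowPolicy is a hypothetical class, and the wrapper that borrows the connection and runs the validation query when shouldValidate() returns true is left out.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical policy object; DBCP offers no such hook, so a wrapper around
// the pool would have to consult it on every borrow.
class EveryNthBorrowPolicy {
    private final int n;
    private final AtomicLong borrows = new AtomicLong();

    EveryNthBorrowPolicy(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    // Returns true on every n-th call. The atomic counter keeps this safe
    // when many threads borrow at the same time.
    boolean shouldValidate() {
        return borrows.incrementAndGet() % n == 0;
    }

    public static void main(String[] args) {
        EveryNthBorrowPolicy policy = new EveryNthBorrowPolicy(10);
        int validations = 0;
        for (int borrow = 0; borrow < 30; borrow++) {
            if (policy.shouldValidate()) {
                validations++;
            }
        }
        System.out.println(validations); // prints 3
    }
}
```

With N = 10 the validation overhead drops to a tenth of what testOnBorrow costs, at the price of a bad connection surviving for up to N - 1 borrows.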

Keep in mind that the eviction thread only validates idle connections. If a bad connection happens to be borrowed from the pool every time the eviction thread runs, it is never validated, and you will see transactions fail sporadically. Running the eviction thread with testWhileIdle set to true is not a foolproof way to safeguard the connection pool from bad connections.
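
To pull the advice in this post together, here is a minimal sketch that spells out every validation-related setting explicitly instead of relying on defaults. It only builds a java.util.Properties object. The property names are the ones listed in the DBCP 1.x configuration documentation, while the class name DbcpSettings and all the values are my own illustrative assumptions, so tune them for your application.

```java
import java.util.Properties;

// Hypothetical holder for explicitly spelled-out DBCP validation settings.
class DbcpSettings {

    static Properties explicitValidationSettings() {
        Properties p = new Properties();
        p.setProperty("validationQuery", "SELECT 1 FROM DUAL");
        p.setProperty("testOnBorrow", "false");   // the default is true, so say so explicitly
        p.setProperty("testOnReturn", "false");
        p.setProperty("testWhileIdle", "true");   // validate only in the eviction thread
        // Assumed value: how often the eviction thread runs.
        p.setProperty("timeBetweenEvictionRunsMillis", "60000");
        // Each eviction run inspects this many connections while holding the pool lock.
        p.setProperty("numTestsPerEvictionRun", "3");
        return p;
    }

    public static void main(String[] args) {
        explicitValidationSettings().list(System.out);
    }
}
```

After adding driverClassName, url, username and password, a Properties object like this can be handed to BasicDataSourceFactory.createDataSource(Properties) in Commons DBCP.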

Hope that helps someone who is facing a similar problem as I did.
