Today I was playing around with Apache Solr. I was really impressed by its text searching capability, especially the MoreLikeThis search handler. I wanted to configure the DataImportHandler to import data directly from my MySQL database. It was really easy to configure, and I was able to perform a full import quickly. But when I tried to do a delta import, I found that it was not working as expected. Even though I was calling the delta import, it was causing a full import.
You might be wondering why I am telling you all this. I suspected that the problem was in my SQL query for the delta load, but to be sure, I wanted to see the query that the Solr DataImportHandler was actually executing. As always, I turned to Google for assistance and eventually landed on the MySQL documentation for the General Query Log. Voila! This was exactly what I wanted. All I had to do was start the server with the --log=[filename] option, and every query would be logged to the specified file. Nice, isn’t it?
So I had to stop my running MySQL server and restart it with the --log switch in addition to its regular options. But there was a problem: I was not sure what those other options were. While the MySQL server is still running, you can use the ps utility to see the parameters it was started with.
ps -ax | grep mysql
For me the output was
/usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --user=mysql --pid-file=/usr/local/mysql/data/localhost.pid --port=3306 --socket=/tmp/mysql.sock
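If you would rather not copy that command line by hand, here is a minimal sketch of the same idea as a pair of shell functions. The names find_mysqld_args and with_query_log are made up for this example, and it assumes a ps that accepts the POSIX -A and -o args= options.

```shell
#!/bin/sh
# Print the full command line of each running mysqld process.
# The [m] bracket stops grep from matching its own entry in the ps listing.
find_mysqld_args() {
    ps -A -o args= | grep '[m]ysqld'
}

# Hypothetical helper: append the query log option to a saved command line.
#   $1 = command line as printed by find_mysqld_args
#   $2 = path of the log file to write
with_query_log() {
    printf '%s --log=%s\n' "$1" "$2"
}
```

Run find_mysqld_args while the server is still up and keep the output. A wrapper process such as mysqld_safe matches the same pattern, so pick the line that starts the mysqld binary itself before passing it to with_query_log.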
Now shut down the MySQL server.
// On Mac
/Library/StartupItems/MySQLCOM/MySQLCOM stop
// On Linux and other Unix variants, try one of
/etc/init.d/mysqld stop
service mysql stop
Start mysqld with the --log option:
/usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --user=mysql --pid-file=/usr/local/mysql/data/localhost.pid --port=3306 --socket=/tmp/mysql.sock --log=/tmp/query.log
The general query log contains lots of irrelevant information. To watch the log with the unwanted entries filtered out, use tail and grep as shown below:
tail -f /tmp/query.log | grep -v Connect | grep -v Quit
The amount of information added to the file is quite large. If you are using this on a production server, I recommend turning off the logging once you are done.
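On MySQL 5.1 and later, the general query log can also be toggled on a running server through the general_log and general_log_file system variables, which saves a restart when you are done. Here is a small sketch, assuming such a version and an account that is allowed to set global variables. The function name general_log_sql is made up for this example, and it only prints the SQL so that you can pipe it to the mysql client yourself.

```shell
#!/bin/sh
# Print the SQL that switches the general query log on or off at runtime.
# Assumes MySQL 5.1+ and a simple log path without quote characters.
#   general_log_sql on FILE   -> log to FILE and enable logging
#   general_log_sql off       -> disable logging
general_log_sql() {
    case "$1" in
        on)
            printf "SET GLOBAL general_log_file = '%s'; SET GLOBAL general_log = 'ON';\n" "$2"
            ;;
        off)
            printf "SET GLOBAL general_log = 'OFF';\n"
            ;;
        *)
            echo "usage: general_log_sql on FILE | off" >&2
            return 1
            ;;
    esac
}

# Usage (prompts for the password of a privileged account):
#   general_log_sql on /tmp/query.log | mysql -u root -p
#   general_log_sql off | mysql -u root -p
```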
One more pro tip: you can chain multiple expressions in a single grep call with -e, like this:
grep -v -e Connect -e Quit
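To make that filter reusable, it can be wrapped in a function. This is only a rough sketch: the name filter_query_log is made up for this example, and it does nothing beyond the grep call above.

```shell
#!/bin/sh
# Read general query log lines on stdin and drop connection bookkeeping.
# Caveat: a query whose own text contains "Connect" or "Quit" is dropped too.
filter_query_log() {
    grep -v -e Connect -e Quit
}

# Usage:
#   tail -f /tmp/query.log | filter_query_log
```

Using a single grep also means there is no second pipe in the middle of the chain, so lines are less likely to sit in a buffer before they reach the terminal.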