I have a mobile application sync process. The transaction does a lot of modification on the database. Since this is done on mobile I need to issue a VACUUM to compact the database.
I am wondering when should I issue a VACUUM
I am currently looking for SQLite, but if it's different for other engines, let me know in the answers (PostgreSQL, MySQL, Oracle, SQLServer)
I would say outside of the transaction. Certainly in PostgreSQL, VACUUM is designed to remove the "dead" tuples (i.e. the old row when a record has been changed or deleted.)
If you're running VACUUM in a transaction that has modified records, these dead rows won't have been marked for deletion.
Depending on which type of VACUUM you're doing, it may also require a table lock which will block if there are other transactions running, so you could potentially end up in a deadlock situation (transaction 1 is blocked waiting for a table lock to do its VACUUM, transaction 2 gets blocked waiting for a row to be released that transaction 1 has locked.)
I'd also recommend that this isn't done in an application (perhaps as a scheduled task) as it can take a while to complete and can negatively affect speed of other queries.
As for SQL Server, there is no VACUUM - what you're looking for is shrink. You can turn on auto shrink in 2005 which will automatically reclaim space when it the server decides, or issue a DBCC statement to shrink the database and log file, but this depends on your backup routine and strategy on a per-database level.
Want it or not when using PostgreSQL you can't run VACUUM in transaction as stated in the manual:
VACUUM cannot be executed inside a transaction block.
Vacuum is like defrag, it's good to do if youve recently deleted a lot of stuff, or maybe after youve inserted a lot of stuff, but by no means should you do it in every transaction. It's slower than almost any other database command and is more of a maintenance task.
We sometimes add/remove the majority of our db file, so then a vacuum would be a good idea, but I still would not consider it a part of the same transaction that did the work.
How frequently is the transaction run?
It's really a daily sort of process not a query by query process, but if you use it without full then it can be used in a transaction since it doesn't acquire a lock.
If your going to do it then it should be outside the transaction, since it is independent of the transactions data integrity.