I'm using nested sets to store hierarchical data in a MyISAM table; the table consists of several hierarchical sets for each user. Each user will be the only one writing to his respective trees, but other users may read from them. Node deletion / Insertion requires that other rows in the same tree have their lft and rgt values updated, potentially hundreds of rows.
In order to do this, I need to get a table write lock, update the other nodes in the tree, delete/insert the row and unlock the table.
What I'm wondering is this -- Do table locks scale to hundreds of concurrent users? thousands?
Would InnoDB's row locks be more efficient in this case? (locking a few hundred rows that will mostly be used only by the user himself)
If I were to use row locks, do I need to add explicit logic to deal with deadlock errors?
Well, the philosophy on locking is different between the two engines.
With MyISAM, the reason for full table locking is that writes should normally be fast. There are only two operation needed for the write (Lock table, then write row to disk). MyISAM performance is really bound by disk speed for this reason.
With InnoDB, it gets a little more complicated. Since it's fully ACID compliant, every write takes 4 steps (Lock row, write to transaction log, write row to dis, write to transaction log). Note that it writes to the disk three times. So that means that (in practice) an InnoDB write will take 3 times longer than a MyISAM write. That's one reason for the row level locking (transactions are another).
But it's not that easy. With MyISAM, the table lock requires one semaphore for that table. So the impact on both memory usage and speed are trivial at best. With InnoDB however, it requires an index and one semaphore per row. It needs an index to speed up the "check" to see if there's already a lock for the row. Now, if you're updating one or 10 rows at the same time, there's little difference. But when you're talking millions of rows the difference can be non-trivial (both in memory usage and speed, since it needs to transverse the lock "index" for each row to be locked).
There is also an additional tradeoff. Since InnoDB is ACID compliant, if there's a power loss (or other crash), you're never left in an inconsistent state. There's no uncommitted transaction's data in the db, and there's no committed transaction corrupted (it will automatically run the transaction log if it detects something to fix it). With MyISAM, a power loss (or crash) during a write can leave the table in an inconsistent state and there's nothing you can do about it. If you care about your data, InnoDB would be better. But, with good Binary logs and a backup system, you should be able to recover MyISAM, but it will require some manual intervention...
Now, with that said, your question of which scales better is really hard. First, are most of your writes dealing with one or two rows? If so, InnoDB and Row level locking will tend to scale better. If you do a lot of queries updating a lot of rows at the same time (tens of thousands and up), you'll notice that MyISAM will tend to have better performance.
As for your question of deadlocks, MySQL will locate and handle them for you (but it won't execute one of the queries, so you may want some exception handling code to either retry the query or something else). The internal system will prevent the deadlock...
Now, another note. Since MySQL supports more than one engine in a db, why not put your data into InnoDB, and then make a MyISAM join table to handle the nested set data? Store parenting info in the data table (via a
parent_id mechanism). That way, all your data is in an ACID compliant db, but you can gain the speed increase by using the faster (for reading and large writes) MyISAM for the nested set logic...