Skip to content Skip to sidebar Skip to footer

Slow Self-join Delete Query

Does it get any simpler than this query? delete a.* from matches a inner join matches b ON (a.uid = b.matcheduid) Yes, apparently it does... because the performance on the abo

Solution 1:

Deleting so many records can take a while, I think this is as fast as it can get if you're doing it this way. If you don't want to invest into faster hardware, I suggest another approach:

If you really want to delete 220 million records, so that the table then has only 15.000 records left, thats about 99,999% of all entries. Why not

  1. Create a new table,
  2. just insert all the records you want to survive,
  3. and replace your old one with the new one?

Something like this might work a little bit faster:

/* creating the new table */CREATETABLE matches_new
SELECT a.*FROMmatches a
LEFTJOINmatches b ON (a.uid = b.matcheduid)
WHERE ISNULL (b.matcheduid)

/* renaming tables */
RENAME TABLEmatchesTO matches_old;
RENAME TABLE matches_new TOmatches;

After this you just have to check and create your desired indexes, which should be rather fast if only dealing with 15.000 records.

Solution 2:

running explain select a.* from matches a inner join matches b ON (a.uid = b. matcheduid) would explain how your indexes are present and being used

Solution 3:

I might be setting myself up to be roasted here, but in performing a delete operation like this in the midst of a self-join, isn;t the query having to recompute the join index after each deletion?

While it is clunky and brute force, you might consider either:

A. Create a temp table to store the uid's resulting from the inner join, then join to THAT, THEN perfoorm the delete.

OR

B. Add a boolean (bit) typed column, use the join to flag each match (this operation should be FAST), and THEN use:

DELETE*FROMmatchesWHERE YourBitFlagColumn =True

Then delete the boolean column.

Solution 4:

You probably need to batch your delete. You can do this with a recursive delete using a common table expression or just iterate it on some batch size.

Post a Comment for "Slow Self-join Delete Query"