Remove Duplicates From A Table And Re-link Referencing Rows To The New Master
I have a table transcription which contains passages of transcribed text and their citations with columns: text, transcription_id(PK), t_notes, citation and the second table town_t
Solution 1:
This single command should do it all:
WITH blacklist AS ( -- identify duplicate IDs and their master
SELECT *
FROM (
SELECT transcription_id
, min(transcription_id) OVER (PARTITION BY text, citation) AS master_id
FROM transcription
) sub
WHERE transcription_id <> master_id
)
, upd AS ( -- redirect referencing rows
UPDATE town_transcription tt
SET transcription_id = b.master_id
FROM blacklist b
WHERE b.transcription_id = tt.transcription_id
)
DELETE FROM transcription t -- kill dupes (now without reference)
USING blacklist b
WHERE b.transcription_id = t.transcription_id;
For lack of definition I chose the row with the smallest ID per group as surviving master row.
FK constraints don't get in the way unless you have non-default settings. Detailed explanation:
- How to remove duplicate rows with foreign keys dependencies?
- Delete duplicates and reroute referencing rows to new master
After removing the dupes you might now want to add a UNIQUE constraint to prevent the same error from reoccurring:
ALTER TABLE transcription
ADD CONSTRAINT transcription_uni UNIQUE (text, citation);
Solution 2:
Use row_number() over(...) to identify rows that repeat information. A partition by text, citation in the over clause will force the row number series to re-start at 1 for each unique set of those values:
select
*
from (
select
text, transcription_id, t_notes, citation
, row_number() over(partition by text, citation
order by transcription_id) as rn
from transcription
) d
where rn > 1
Once you have verified those as the unwanted rows,then use the same logic for a delete statement.
However, you may loose information held in the t_notes column - are you willing to do that?
Post a Comment for "Remove Duplicates From A Table And Re-link Referencing Rows To The New Master"