sql - Deleting duplicate rows from Redshift
I am trying to delete duplicate data in a Redshift table.
Below is my query:
with duplicates as (select *, row_number() over (partition by record_indicator order by record_indicator) as duplicate from table_name) delete from duplicates where duplicate > 1;
This query is giving me an error:
Amazon Invalid operation: syntax error at or near "delete";
I'm not sure what the issue is, as the syntax of the with clause seems correct. Has anyone faced this situation before?
Redshift being what it is (no enforced uniqueness for any column), Ziggy's 3rd option is best. Once you decide to go the temp table route, it is more efficient to swap things out whole. Deletes and inserts are expensive in Redshift.
begin;
create table table_name_new as select distinct * from table_name;
alter table table_name rename to table_name_old;
alter table table_name_new rename to table_name;
drop table table_name_old;
commit;
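Note that select distinct * only collapses rows that are identical in every column. If duplicates are instead defined by a key, as the row_number() attempt in the question suggests, the new table can be built by keeping one row per key. This is a minimal sketch, assuming record_indicator is that key and that it does not matter which copy survives; the helper column name duplicate is taken from the question, and the subquery alias ranked is made up for illustration.

```sql
-- Keep exactly one row per record_indicator (assumed to be the duplicate key).
-- Ordering by the partition column makes the surviving row arbitrary;
-- order by a timestamp or similar column if a specific copy should win.
create table table_name_new as
select *
from (
    select *,
           row_number() over (
               partition by record_indicator
               order by record_indicator
           ) as duplicate
    from table_name
) as ranked
where duplicate = 1;

-- Remove the helper column so the new table matches the original layout.
alter table table_name_new drop column duplicate;
```

The rename steps from the swap above would then apply unchanged.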
If space isn't an issue, you can keep the old table around for a while and use the other methods described here to validate that the row count in the original, accounting for duplicates, matches the row count in the new one.
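One way to run that check is sketched below. It assumes the drop table step was skipped, so table_name_old still holds the original rows and table_name is the deduplicated copy; the aliases deduped, expected_rows and actual_rows are illustrative.

```sql
-- Number of distinct rows in the original table, i.e. the row count
-- the new table is expected to have.
select count(*) as expected_rows
from (select distinct * from table_name_old) as deduped;

-- Actual row count of the new table after the swap.
select count(*) as actual_rows
from table_name;
```

If the two numbers agree, the old table can be dropped.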
If you're doing constant loads to such a table, you'll want to pause that process while this is going on.