Any help would be appreciated here. My question is two part:
1.) Select duplicates that match any of a number of columns for example...
The email is the same OR
The homephone is the same OR
The mobilephone is the same OR
The address1 is the same
The uniqueID is ConsIntID
My Select Statement: (Which does not work)
SELECT CONSINTID, FIRSTNAME, LASTNAME, EMAIL1, ADDRESS1, HOMEPHONE, MOBILEPHONE, CREATEDATE, USERIDS
FROM CONSULTANTS
WHERE CONSINTID IN (SELECT CONSINTID
FROM CONSULTANTS
HAVING COUNT(HOMEPHONE) > 1 OR COUNT(MOBILEPHONE) >1 OR etc.... )
-
Once the Select Duplicates Statement is corrected I will insert those into a ConsultantsDupe Table
The Second Question is: If I have more then one duplicate my stored_procedure fails
Here is my cursor and logic. The problem is getting multiple results in the subquery and I don't know the best way to correct this.
DECLARE DUPES CURSOR FOR
SELECT CONSINTID, FIRSTNAME, LASTNAME, EMAIL1, ADDRESS1, HOMEPHONE, MOBILEPHONE, CREATEDATE, USERIDS
FROM CONSULTANTS_DUPS
ORDER BY CREATEDATE DESC
OPEN DUPES
FETCH NEXT FROM DUPES
INTO @.CONSINTID, @.FIRSTNAME, @.LASTNAME, @.EMAIL1, @.ADDRESS1, @.HOMEPHONE, @.MOBILEPHONE, @.CREATEdATE, @.USERIDS
WHILE @.@.FETCH_STATUS = 0
BEGIN
SET @.FOUNDCONSINTID = (SELECT DISTINCT CONSINTID FROM CONSULTANTS_DUPS
WHERE CONSINTID <> @.CONSINTID AND HOMEPHONE = @.HOMEPHONE "NEED TO ADD COMPARISON FOR MOBILEPHONE, EMAIL ADDRESS HERE") <== This can provide more then one result which messes me up.
I'll take a couple shots at this. Hopefully somebody can post something even better. You'll have to forgive me. Without any sample tables I had to "shoot from the hip" and write it without any syntax checking.
SELECT * FROM Consultants C
WHERE EXISTS (SELECT * FROM Consultants CPhone
WHERE (C.HomePhone = CPhone.HomePhone
OR C.HomePhone = CPhone.CellPhone
OR C.CellPhone = CPhone.HomePhone
OR C.CellPhone = CPhone.CellPhone)
AND C.CONSINTID <> CPhone.CONSINTID
)
OR EXISTS (SELECT * FROM Consultants CEmail
WHERE C.EMAIL1 = CEmail.EMAIL1
AND C.CONSINTID <> EMAIL1.CONSINTID)
OR EXISTS ...
Here's another shot:
SELECT * FROM Consultants
WHERE HomePhone IN (SELECT HomePhone
FROM Consultants
WHERE HomePhone IS NOT NULL
GROUP BY HomePhone
HAVING COUNT(*) > 1)
OR CellPhone IN (SELECT CellPhone
FROM Consultants
WHERE CellPhone IS NOT NULL
GROUP BY CellPhone
HAVING COUNT(*) > 1)
OR Email1 IN (SELECT Email1
FROM Consultants
WHERE Email1 IS NOT NULL
GROUP BY Email1
HAVING COUNT(*) > 1)
This method suffers from the fact that you can't check HomePhone and CellPhone at the same time (with UNION ALL) because of a contact has the same phone number listed as home and cell then it could show up as a duplicate.
Here's where you might take the previous query as a natural transition to cross-check home phone against cell phone:
SELECT * FROM Consultants C
WHERE HomePhone IN (SELECT HomePhone
FROM (SELECT CONSINTID, HomePhone
FROM Consultants UNION
SELECT CONSINTID, CellPhone
FROM Consultants) SubQ
WHERE HomePhone IS NOT NULL
GROUP BY HomePhone
HAVING COUNT(*) > 1)
OR CellPhone IN (SELECT HomePhone
FROM (SELECT CONSINTID, HomePhone
FROM Consultants UNION
SELECT CONSINTID, CellPhone
FROM Consultants) SubQ
WHERE HomePhone IS NOT NULL
GROUP BY HomePhone
HAVING COUNT(*) > 1)
OR Email1 IN (SELECT Email1
FROM Consultants
WHERE Email1 IS NOT NULL
GROUP BY Email1
HAVING COUNT(*) > 1)
You may also be better off with multiple SELECT statements as it's hard to tell the cause of your duplicate in a query like this. Keep in mind that all of these queries are very read-intensive as you're dealing with multiple scans across your table. You may benefit, instead, by placing your duplicates in a temporary table (or table variable) and then querying against the temporary table.
|||something like this:select *
from consultants
where consintid in
(select consintid
from consultants
group by email
having count(*)>1
union all
select consintid
from consultants
group by homephone
having count(*)>1
union all
select consintid
from consultants
group by cellphone
having count(*)>1
union all
select consintid
from consultants
group by address1
having count(*)>1
)|||
Jared,
Yours worked! This gives me what I need for the duplicate temp table and the assistance is greatly appreciated.
Any feedback on the second portion?
No comments:
Post a Comment