WITH CTE AS
(
SELECT fips, cnty_name,
ROW_NUMBER() OVER (PARTITION BY fips, cnty_name ORDER BY fips, cnty_name) AS DuplicateCount
FROM zip_to_fips
WHERE fips IS NOT NULL AND cnty_name IS NOT NULL
)
SELECT fips, cnty_name, COUNT(*) AS DuplicateCount
FROM CTE
WHERE DuplicateCount >= 1
GROUP BY fips, cnty_name
ORDER BY DuplicateCount DESC
This query uses a commonly used query (CTE) to group the results and count duplicates in the zip_to_fips
table.
The first part of the query, the CTE, selects the columns «fips», «cnty_name» and adds an additional column «DuplicateCount» that uses the ROW_NUMBER()
function to assign a unique number to each row that has equal values in the «fips» and «cnty_name» columns. The PARTITION BY
clause indicates the columns by which to group and the ORDER BY
clause indicates the order in which to assign numbers.
The second part of the query uses the CTE to select the columns «fips», «cnty_name» and uses the COUNT(*)
function to count the number of times each combination of «fips» and «cnty_name» appears. The GROUP BY
clause indicates the columns by which to group the result. Finally, the ORDER BY
clause indicates the order in which to sort the results, in this case by DuplicateCount
in descending order.
The query filters only the rows that have non-null values in the «fips» and «cnty_name» columns using the WHERE fips IS NOT NULL AND cnty_name IS NOT NULL
clause. In addition, the query also filters only the results with DuplicateCount
greater than 1, i.e., those that have duplicates.