Skip to content Skip to sidebar Skip to footer

Why Do These Datetime Values Return The Same Checksum & Checksum_agg? How Can I Make It More Unique?

I'm trying to quickly determine if two sets of schedules are identical and also generate a key which these unique schedules can be references by. I originally attempted to use HASH

Solution 1:

Do you think all this checksum stuff - given what you have to also do to ensure uniqueness - is worth the trouble? Personally I think you will get better performance (and less complexity) just comparing the columns directly instead of trying to reduce your work and compare only one value.

Also keep in mind that datetime values are just pairs of integers when you get down to it, so it is not really all that surprising that applying checksum to the combination of the two values may lead to the same values. For example:

SELECT CHECKSUM_AGG(x)
FROM
(
  SELECT CHECKSUM(1,2)
  UNION ALL 
  SELECT CHECKSUM(2,3)
) AS y(x);

SELECT CHECKSUM_AGG(x)
FROM
(
  SELECT CHECKSUM(2,2)
  UNION ALL 
  SELECT CHECKSUM(1,3)
) AS y(x);

Results:

----
49

----
49

So I suggest just put an index on StartDate, EndDate and be done with it. You're trying to make something that's already pretty efficient more efficient, and I think you are accomplishing the opposite.

As for the key, just use an IDENTITY column or some other surrogate. I see no advantage to nesting CHECKSUM_AGG(CHECKSUM(HASHBYTES(col1),HASHBYTES(col2))) to simulate uniqueness...

EDIT

Or given the new requirement, just use a ROWVERSION column if you want to make sure the data is the same as the last time you read it. I don't see how tracking millions of checksum results is any different from tracking rowversion or otherwise calculated values. You're working way too hard when there are already built-in things that do what you're trying to do...

Solution 2:

From a comment on this page:

http://msdn.microsoft.com/en-us/library/ms188920.aspx

It appears that Checksum_Agg is built by using XORs. And the thing about XORs is, they tend to be easily reversible by including the same number twice. This explains why you've only noticed it when it's even.

As long as you're aware of the XOR issue and pre-scramble what you feed to it in a way that mixes up all the bits you should be OK.

Solution 3:

i faced this isssue too. That appears when you have all values in column same. Probably it doesn't take this column, when calculating sum.

Post a Comment for "Why Do These Datetime Values Return The Same Checksum & Checksum_agg? How Can I Make It More Unique?"