One of the databases I’m using for my thesis project is storing a series of data-streams that need micro-second or even finer grained precision for the timestamps. Since I’m using a MySQL database, it seems that there is no native support for these type of timestamps (if this is not the case anymore, please comment!), and the person who designed the schema decided to use two fields to describe the timestamps:
- dataTimeStamp, as a DATETIME type.
- fractionalTimeStamp, as a DECIMAL(6,6) type.
The first attribute would store the timeStamp with a precision of up to a second, and then the other attribute would store the fractional values. This clearly works, but is far from being optimal. Specially when you have a huge dataset (days or weeks of continuous data) and want to extract a specific region in time.
As a quick solution, I decided to consolidate those two fields into a single BIGINT type column (I’m using field, column and attribute interchangeably here). The new column would contain a 64-bit integer with the first 32-bits used for a traditional UNIX timestamp (number of seconds since 00:00:00 1/1/1970, also known as the epoch), and the other 32-bits would be used for the fractional part of the timestamp, only this time in nano-seconds (10^-9).
To do this conversion, I’m thinking about using something like:
UPDATE power_data
SET timeStamp =
(CAST(UNIX_TIMESTAMP(dataTimeStamp) as UNSIGNED) << 32 | CAST(fractionalTimeStamp * pow(10,9) AS UNSIGNED));
I'm still not sure if this is such a great idea. On the positive side, this new column should sort the table well (instead of having to sort by the two previous columns). On the negative side, I would still have to convert those numbers back into meaningful dates whenever I want to use them.
Another option would be to use the full 64 bits to encode the number of nano-seconds since the epoch. This would be OK for the next 500+ years, since 2^64 is approximately 1.85 x 10^19, and the number of nano-seconds since the epoch on the year 2510 is approximately 1.57 x 10^19. This approach would also sort well, and would need a less complicated parsing process in the end.
What do you think? I think I'm going to go with the second option.
Last revised on 2010/01/31