Nano-second timestamps in MySQL

2010/01/29

en Research

One of the databases I’m using for my thesis project is storing a series of data-streams that need micro-second or even finer grained precision for the timestamps. Since I’m using a MySQL database, it seems that there is no native support for these type of timestamps (if this is not the case anymore, please comment!), and the person who designed the schema decided to use two fields to describe the timestamps:

  • dataTimeStamp, as a DATETIME type.
  • fractionalTimeStamp, as a DECIMAL(6,6) type.

The first attribute would store the timeStamp with a precision of up to a second, and then the other attribute would store the fractional values. This clearly works, but is far from being optimal. Specially when you have a huge dataset (days or weeks of continuous data) and want to extract a specific region in time.

As a quick solution, I decided to consolidate those two fields into a single BIGINT type column (I’m using field, column and attribute interchangeably here). The new column would contain a 64-bit integer with the first 32-bits used for a traditional UNIX timestamp (number of seconds since 00:00:00 1/1/1970, also known as the epoch), and the other 32-bits would be used for the fractional part of the timestamp, only this time in nano-seconds (10^-9).

To do this conversion, I’m thinking about using something like:

UPDATE power_data
SET timeStamp =
(CAST(UNIX_TIMESTAMP(dataTimeStamp) as UNSIGNED) << 32 | CAST(fractionalTimeStamp * pow(10,9) AS UNSIGNED));

I'm still not sure if this is such a great idea. On the positive side, this new column should sort the table well (instead of having to sort by the two previous columns). On the negative side, I would still have to convert those numbers back into meaningful dates whenever I want to use them.

Another option would be to use the full 64 bits to encode the number of nano-seconds since the epoch. This would be OK for the next 500+ years, since 2^64 is approximately 1.85 x 10^19, and the number of nano-seconds since the epoch on the year 2510 is approximately 1.57 x 10^19. This approach would also sort well, and would need a less complicated parsing process in the end.

What do you think? I think I'm going to go with the second option.

Last revised on 2010/01/31

  • fmanzano

    Tambien opino que la segunda opcion es mejor.
    Con la velocidad actual de los uP.
    Puedes tambien muestrear el sensado(dependiendo de lo critico del proceso) calcular estadsiticos para esas muestras y solo almacenar los mismos, con un TIME STAMP de menor resolucion

    No conozco el Objeto de Sensado, pero ese aproach tambien funciona

    Es como usuar el Teorema de Nyquist para evitar el Sobre muestreo o el submuestreo.
    Siempre se puede encontrar un funcion de interpolacion para la data sensada en procesos macro fisicos (o de fisico-clasicos)

  • fmanzano

    Por cierto…excelente presentacion

    Ni te imagines volver……… :-) …ven de vacaciones……y cuando necesites de la bohemia rural y urbana

    FM

  • http://blog.marioberges.com marioberges

    En este caso necesito toda la data porque parte de la investigación es encontrar cuál es la más eficiente tasa de muestreo.

  • http://blog.marioberges.com marioberges

    Gracias.

Entrada anterior:

Próxima entrada: