Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/InfluxDB - Increase time precision to microseconds for timestamp fields #851

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

afausti
Copy link
Contributor

@afausti afausti commented May 6, 2022

#626 assumes Unix timestamp in seconds if the type is double and coerce to Long in milliseconds. In this PR we increase the time precision to microseconds for timestamp fields.

I think a mechanism to set the time precision as proposed in #850 would be a better solution though.

@davidsloan
Copy link
Collaborator

davidsloan commented May 9, 2022

Hi, thanks for the pull request.

However I do not believe it is necessary.

I would have thought you should be able to achieve what you are looking to do by adding TIMESTAMPUNIT to the KCQL.

INSERT INTO table SELECT col1,col2 FROM topic TIMESTAMPUNIT=MICROSECONDS

We would prefer to keep this user-configurable via KCQL than to have it hard-coded to a particular unit.

Having looked at the documentation it does seem this is missing, so I will make a note to add this in the near future.

@afausti afausti force-pushed the feature/influxdb-time-precision branch from 006fc00 to d1ed145 Compare May 10, 2022 19:16
@afausti
Copy link
Contributor Author

afausti commented May 10, 2022

Thanks @davidsloan!

Indeed I just tested the TIMESTAMPUNIT to specify the precision of the timestamp field and that worked, so I reverted the following:

@@ -72,7 +72,7 @@ object InfluxPoint {
       .flatMap { path =>
         record
           .field(path)
-          .map(coerceTimeStamp(_, path.value).map(details.timestampUnit -> _))
+          .map(coerceTimeStamp(_, path.value).map(TimeUnit.MICROSECONDS -> _))
       }
       .getOrElse(Try(TimeUnit.NANOSECONDS -> nanoClock.getEpochNanos))

But I'm still multiplying the double timestamp in seconds by 1E6 to convert it to microseconds for this to work.

My Scala is not good enough to make this more generic, but I think the idea is to use the TIMESTAMPUNIT value to multiply the double timestamp by the correct factor before coercing it to Long.

Here's the connector configuration I've used with this PR

{
    "connect.influx.db": "mydb",
    "connect.influx.error.policy": "THROW",
    "connect.influx.kcql": "INSERT INTO Test.logevent_heartbeat SELECT * FROM Test.logevent_heartbeat WITHTIMESTAMP private_efdStamp TIMESTAMPUNIT=MICROSECONDS",
    "connect.influx.max.retries": "10",
    "connect.influx.password": "",
    "connect.influx.retry.interval": "60000",
    "connect.influx.url": "http://influxdb:8086",
    "connect.influx.username": "-",
    "connect.progress.enabled": "false",
    "connector.class": "com.datamountaineer.streamreactor.connect.influx.InfluxSinkConnector",
    "name": "influxdb-sink",
    "tasks.max": "1",
    "topics": "Test.logevent_heartbeat"
}
docker-compose exec influxdb influx -database mydb -execute 'SELECT private_efdStamp FROM "Test.logevent_heartbeat"'
1652209485367887000 1652209485.367887
1652209486389710000 1652209486.3897102
1652209487411820000 1652209487.41182
1652209488434421000 1652209488.4344213
1652209489456561000 1652209489.4565613
1652209490478031000 1652209490.4780314
1652209491500193000 1652209491.500193
1652209492521670000 1652209492.52167

Can I get some help to make this more generic?

I'll edit issue #850 to explain that TIMESTAMPUNIT is currently not working with double timestamps from the tests above.

@davidsloan
Copy link
Collaborator

Let me have a little more of a think, try a few scenarios and get back to you.

My first thought is that maybe the TimestampConverter SMT could help here, as it is quite versatile.

@afausti
Copy link
Contributor Author

afausti commented Jul 1, 2022

@davidsloan I was looking at TimestampConverter and looks like it cannot change the timestamp precision when writing Unix timestamps.

I think the path forward is to use the TIMESTAMPUNIT information to multiply the double timestamp in seconds by the corresponding factor. I'm happy to discuss this idea further if needed.

- For double timestamp fields, increase time precision from milliseconds to microseconds
- It is now coerced to Long with microseconds precision
@afausti afausti force-pushed the feature/influxdb-time-precision branch from d1ed145 to c6f57bc Compare September 8, 2022 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants