After initial read of /dev/kmsg, ignore kernel timestamp

The spec[1] says the /dev/kmsg timestamp is a monotonic clock and in microseconds. After a while you realize it's also relative to the boot of the system, that fact was probably too obvious to be put in the spec. However, what's *not* in the spec, and what takes a while to realize, is that this monotonic time is *not* adjusted for suspend/resume cycles ... On a frequently used laptop this can manifest itself as follows. The kernel is stuck on Nov 15, and for the life of me I cannot find any to adjust for this offset: $ dmesg -T |tail -1; date [Mon Nov 15 01:42:08 2021] wlan0: Limiting TX power to 23 (23 - 0) dBm as advertised by 18:e8:29:55:b0:62 Tue 23 Nov 2021 05:20:53 PM CET Hence this patch. After initial "emptying" of /dev/kmsg when syslogd starts up, we raise a flag (denoting done with backlog), and after this point we ignore the kernel's idea of time and replace it with the actual time we have now, the same that userspace messages are logged with. Sure, there will be occasions where there's a LOT of kernel messages to read and we won't be able to keep track. Yet, this patch is better than the current state (where we log Nov 15). [1]: https://www.kernel.org/doc/Documentation/ABI/testing/dev-kmsg Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
2021-11-23 17:16:17 +01:00
parent c7e30c6bd0
commit 9f6fbb3301
1 changed files with 26 additions and 3 deletions
--- a/src/syslogd.c
+++ b/src/syslogd.c
@@ -125,6 +125,7 @@ static int	  Debug;		/* debug flag */
 static int	  Foreground = 0;	/* don't fork - don't run in daemon mode */
 static time_t	  boot_time;		/* Offset for printsys() */
 static uint64_t	  sys_seqno = 0;	/* Last seen kernel log message */
+static int	  sys_seqno_init;	/* Timestamp can be in the past, use 'now' after first read */
 static int	  resolve = 1;		/* resolve hostname */
 static char	  LocalHostName[MAXHOSTNAMELEN + 1]; /* our hostname */
 static char	 *LocalDomain;			     /* our local domain name */
@@ -224,6 +225,7 @@ static void sys_seqno_load(void)
 			break; /* str began with a number but has junk left over at the end */

 		sys_seqno = val;
+		sys_seqno_init = 1; /* Ignore sys timestamp from now */
 	}
 	fclose(fp);
 }
@@ -244,6 +246,8 @@ static void sys_seqno_save(void)
 	fclose(fp);

 	prev = sys_seqno;
+
+	sys_seqno_init = 1;	/* Ignore sys timestamp from now */
 }

 int usage(int code)
@@ -1189,7 +1193,7 @@ void printsys(char *msg)
 				++p;
 		} else if (isdigit(*p)) {
 			/* Linux /dev/kmsg: "pri,seq#,msec,flag[,..];msg" */
-			time_t now = boot_time;
+			time_t now;

 			/* pri */
 			buffer.pri = 0;
@@ -1218,8 +1222,27 @@ void printsys(char *msg)
 			/* timestamp */
 			while (isdigit(*p))
 				ustime = 10 * ustime + (*p++ - '0');
-			now += ustime / 1000000;
-			buffer.timestamp.usec = ustime % 1000000;
+
+			/*
+			 * When syslogd starts up, we assume this happens at
+			 * close to system boot, we read all kernel logs from
+			 * /dev/kmsg (Linux) and calculate the precise time
+			 * stamp using boot_time + usec to get the time of a
+			 * log entry.  However, since the kernel time stamp
+			 * is not adjusted for suspend/resume it can be many
+			 * days (!) off after a few weeks of uptime.  It has
+			 * turned out to be quite an interesting problem to
+			 * compensate for, so at runtime we instead use the
+			 * current time of any new kernel messages.
+			 *     -- Joachim Wiberg Nov 23, 2021
+			 */
+			if (!sys_seqno_init) {
+				now = boot_time + ustime / 1000000;
+				buffer.timestamp.usec = ustime % 1000000;
+				localtime_r(&now, &buffer.timestamp.tm);
+			} else
+				now = time(NULL);
+
 			localtime_r(&now, &buffer.timestamp.tm);

 			/* skip flags for now */