-
-
Notifications
You must be signed in to change notification settings - Fork 8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
What is the best practice to read date type value? #1565
Comments
I am also experiencing this, the dates are wrong already in the workbook returned by I am having trouble finding a good workaround for this. Could you explain your workaround @jngbng? Did you manually add 4 years to dates in the workbook before converting it to JSON when |
@zoeesilcock There are two more problems. SSF module output, instead of JS native Date, is preferable to represent date when importing excel file. TL;DR. I am using following workaround code. // Take following code from [email protected].
// They are private scoped and inaccessible from outside of the library.
const basedate = new Date(1899, 11, 30, 0, 0, 0);
const dnthresh = basedate.getTime() + (new Date().getTimezoneOffset() - basedate.getTimezoneOffset()) * 60000;
const day_ms = 24 * 60 * 60 * 1000;
const days_1462_ms = 1462 * day_ms;
function datenum(v, date1904) {
let epoch = v.getTime();
if (date1904) {
epoch -= days_1462_ms;
}
return (epoch - dnthresh) / day_ms;
}
function fixImportedDate(date, is_date1904) {
// Convert JS Date back to Excel date code and parse them using SSF module.
const parsed = xlsx.SSF.parse_date_code(datenum(date, false), {date1904: is_date1904});
return `${parsed.y}-${parsed.m}-${parsed.d}`;
// or
// return parsed;
// or if you want to stick to JS Date,
// return new Date(parsed.y, parsed.m, parsed.d, parsed.H, parsed.M, parsed.S);
}
function useSSFOutput() {
const wb = xlsx.readFile('./tz_test_dates.xlsx', {cellDates: true});
const sheet = wb.Sheets[(wb.SheetNames[0])];
// original output
const converted = xlsx.utils.sheet_to_json(sheet, {header: 1, cellDates:true});
// apply hotfix
const is_date1904 = wb.Workbook.WBProps.date1904;
const fixed = converted.map((arr) => arr.map((v) => {
if (v instanceof Date) {
return fixImportedDate(v, is_date1904);
} else {
return v;
}
}));
console.log(fixed.map(arr => arr.map(v => v.toString())));
}
useSSFOutput(); Run above code with tz_test_dates.xlsx and will get following result: tz_test_dates.xlsx preview:
Detail
var basedate = new Date(1899, 11, 30, 0, 0, 0); // 2209161600000
var dnthresh = basedate.getTime() + (new Date().getTimezoneOffset() - basedate.getTimezoneOffset()) * 60000;
function numdate(v) {
var out = new Date();
out.setTime(v * 24 * 60 * 60 * 1000 + dnthresh);
return out;
} issue 1: precision bug. refer to #1470On some countries, you may lose some time (in Korea, -52 sec when parsing). function showOriginal() {
const wb = xlsx.readFile('./tz_test_dates.xlsx', {cellDates: true});
const sheet = wb.Sheets[(wb.SheetNames[0])];
// original output
const converted = xlsx.utils.sheet_to_json(sheet, {header: 1, cellDates:true});
console.log('showOriginal:');
console.log(converted.map(arr => arr.map(v => v.toString())));
}
showOriginal();
/////////////////////////////////////
function getTimezoneOffsetMS(date) {
var time = date.getTime();
var utcTime = Date.UTC(date.getFullYear(),
date.getMonth(),
date.getDate(),
date.getHours(),
date.getMinutes(),
date.getSeconds(),
date.getMilliseconds());
return time - utcTime;
}
const importBugHotfixDiff = (function () {
const basedate = new Date(1899, 11, 30, 0, 0, 0);
const dnthreshAsIs = (new Date().getTimezoneOffset() - basedate.getTimezoneOffset()) * 60000;
const dnthreshToBe = getTimezoneOffsetMS(new Date()) - getTimezoneOffsetMS(basedate);
return dnthreshAsIs - dnthreshToBe;
}());
function fixPrecisionLoss(date) {
return (new Date(date.getTime() - importBugHotfixDiff));
}
function showPrevisionLossHotfix() {
const wb = xlsx.readFile('./tz_test_dates.xlsx', {cellDates: true});
const sheet = wb.Sheets[(wb.SheetNames[0])];
// original output
const converted = xlsx.utils.sheet_to_json(sheet, {header: 1, cellDates:true});
// apply hotfix. ignore date1904 problem for now.
const fixed = converted.map((arr) => arr.map((v) => {
if (v instanceof Date) {
return fixPrecisionLoss(v);
} else {
return v;
}
}));
console.log('showPrevisionLossHotfix:');
console.log(fixed.map(arr => arr.map(v => v.toString())));
}
showPrevisionLossHotfix(); Run above code after setting computer's time zone to Asia/Seoul (UTC+09:00) then will get:
Notice that 52 seconds error has gone, but '1960-01-01' and '1908-01-01' are not correctly parsed. issue2: timezone offset is not constant within one time zone.Noice that // --------------------------------------------------
// Take following code from [email protected].
// They are private scoped and inaccessible from outside of the library.
//
const basedate = new Date(1899, 11, 30, 0, 0, 0);
const dnthresh = basedate.getTime() + (new Date().getTimezoneOffset() - basedate.getTimezoneOffset()) * 60000;
const day_ms = 24 * 60 * 60 * 1000;
const days_1462_ms = 1462 * day_ms;
function datenum(v, date1904) {
let epoch = v.getTime();
if (date1904) {
epoch -= days_1462_ms;
}
return (epoch - dnthresh) / day_ms;
}
// -------------------------------------------------
function fixImportedDate(date, isDate1904) {
const parsed = xlsx.SSF.parse_date_code(datenum(date, false), {date1904: isDate1904});
// return `${parsed.y}-${parsed.m}-${parsed.d}`;
return new Date(parsed.y, parsed.m, parsed.d, parsed.H, parsed.M, parsed.S);
}
function useSSFOutput() {
const wb = xlsx.readFile('./tz_test_dates.xlsx', {cellDates: true});
const sheet = wb.Sheets[(wb.SheetNames[0])];
// original output
const converted = xlsx.utils.sheet_to_json(sheet, {header: 1, cellDates:true});
// apply hotfix
const isDate1904 = wb.Workbook.WBProps.date1904;
const fixed = converted.map((arr) => arr.map((v) => {
if (v instanceof Date) {
return fixImportedDate(v, isDate1904);
} else {
return v;
}
}));
console.log('useSSFOutput:');
console.log(fixed.map(arr => arr.map(v => v.toString())));
}
useSSFOutput(); Above code gives following result: On LosAngeles timezone:
On Asia/Seoul timezone:
|
@jngbng thanks for going over this! Was not aware of the minute differences in KST. The date handling is a bit of a mess and we likely have to revisit both SSF and this library. In the next push, we're going to deprecate the SSF repo and merge the contents as a package in this repo to make it easier to update both at the same time (they are separate because originally the XLS parsing was a separate library and both used this component) Since you looked into this a bit, maybe you can share your opinion about the best representation for the library: Excel datesUnder the hood, Excel dates are numbers representing the number of days (+ fraction of a day) from a given epoch. The default epoch is December 31 1899 (rendered as "1/0/00"). When the 1904 Date System is used, the default epoch is January 1 1904. As is the case with JS, Excel days are exactly 86400 seconds long, even during DST adjustments. How Excel deviates from JS
This is why SSF has a special date struct -- to support the date that JS cannot.
File DatesExcel has three ways of storing dates in files:
What SheetJS currently attempts to doThe internal representation of date cells is the date as understood in the JS engine timezone. For example, in ET (currently UTC-400), a date cell like <c r="A1" t="d">
<v>2019-10-31T00:00:00.000Z</v>
</c> or a number cell using a date format like <c r="A2" s="1">
<v>43769</v>
</c> will be parsed and stored as The main reason is to make the easier to supply dates from JS. For example: var wb = XLSX.utils.book_new();
var ws = { A1: {t:"d", v:new Date("October 31 2019 3:00 PM") }, "!ref": "A1:A1" };
XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
XLSX.writeFile(wb, "test1565.xlsx"); will create a file that stores the date code Design goals and considerations"Date representation": Converting between XLSX type "d" and Excel Date codes should work irrespective of the computer timezone setting. "Round-trip": Reading a file and writing it back should always store the same date. "Cross-Timezone": Server-side processing is tricky because the timezone of the user and server oftentimes differ. "Limited complexity": Limited date shifting (adding or subtracting the timezone offset) in the end-user code Since you mentioned that the timezone offset changed by a little bit, we'll rethink that part. |
I have not looked into this as deeply yet, but it is also a problem for me - ideally I want to interpret dates as UTC, perhaps this could be an option? I realise that you may then run into round-tripping problems. It also seems in new versions as opposed to old ones - say Naively, as a first solution to the fractional timezone issues, the timezone offset could be worked out using |
This thread has ballooned to include a general discussion about handling dates. Could we return to the original issue, namely that |
I've just been bitten by this—shouldn't we autodetect which date format we have, instead of returning an off-by-4-years date? |
@Fil "off by 4 years" sounds like a problem with the date system (1900 vs 1904) which is not related to the issue at hand. Please raise a new issue and include an example |
I don't understand your comment. The issue as described in the OP gives the example of a file with four rows that contain a date of 2019-01-01. When it's opened with xlsx, three of the dates are transformed (on my computer in European timezone) into 2014-12-30T23:59:39, which is off by 4 years, 1 day and 21 seconds. For the OP (evaluated in Korean TZ), the offset is 4 years, 9 hours and 42 seconds. |
How about changing the basedate to UTC?
|
So I have a file that's created in the Norway timezone and I'm in India and when I convert the dates using Dates in the sheet: 24.09.2020 Dates after doing this: const file = xlsx.readFile(filePath, {
cellDates: true
}); 2020-09-24T17:29:49.999Z There's a one day difference. The date 10/09/2020 is coming as 09/09/2020. 15th as 14 and 25th as 24. |
For my project i do not need any dates before year 2000 so i convert all the values of 1900 to 2000 and 1899 to 1999 in sheetjs code, that has one side affect that this problem has gone away but all my excel dates became 100 years ago, so i add 100 years to my javascript date objects like:
|
@SheetJSDev I am having the same issue my excel has |
Seems like Date parsing isnt good right now, only with different workarounds for post-update the data |
i also meet the same issue, here is my solution: when xlsx.read is called, when there is a date need to handle, i just add a '#' to the start and end of the date, to make it a regular string, when i call sheet_to_json, then i just remove the '#' to restore the original date. ex: date = '2022-01-01 12:40:33', when i call xlsx.read i trun it into '#2022-01-01 12:40:33#', when i call sheet_to_json i remove '#' and change it back to '2022-01-01 12:40:33'. just turn date to a regular string as 'abcd...' and change it back then. |
how to add ’#‘ |
export function SerializeDateTime(dt){ export function DeserializeDateTime(dt) { |
but when i use xlsx.read(data) from file,date is already changed. |
Following |
It seems that there is still no solution for this problem after 4 years. I'm from Vietnam and mine is 30 second deduct |
Can somebody help in this, |
Excel 用【文本】格式保存日期 解析就是正常的 如果必须解析【日期】or【时间】组件调用sheet_to_csv (没有秒数差异) 和sheet_to_json 数据拼接起来 日期时间就是完整的了 我感觉组件本身获取的时间戳应该是对的 sheet_to_json时候日期对象里面导致的时间差 |
It is normal for Excel to save date parsing in [text] format. If it is necessary to parse the [date] or [time] component, calling sheet_to_csv (without any difference in seconds) and combining the data of sheet_to_json, the date and time are complete. I feel that the timestamp obtained by the component itself should be correct. The time difference caused by the date object in sheet_to_json is correct. |
With
cellDates
read option, xlsx library tries to convert date-type cell to js Date object.However, it does not seem to respect
date1904
property of sheet when constructing js Date object. #126excel_date.xlsx
The above code with the attached excel file gives the following result:
I expected that the generated js Date objects are of '2019-01-01', but they are skewed due to
date1904
problem.I converted all js Date values in my program.
But I think It would be better that the library do this magical conversion so that users do not need to consider
date1904
anymore.Am I missing useful option?
The text was updated successfully, but these errors were encountered: