This is to report an issue reading the SAS dataset linked below, which has 6 columns. The first column, "YEAR", is of numeric type with length 4 and no precision, yet it is getting converted to Double.
While testing with a Java program using the parso library, I could see that columnFormatWidth for this column comes back as 0. In the saurfang source code I found the following excerpt, which checks columnFormatWidth when deciding whether the column is of Short/Int/Long type:
"""
// Map SAS column types to Spark types.
val columnSparkType: DataType = {
if (columnClass == classOf[Number]) {
if (DATE_TIME_FORMAT_STRINGS.contains(columnFormatName)) {
TimestampType
} else if (DATE_FORMAT_STRINGS.contains(columnFormatName)) {
DateType
} else if (columnFormatPrecision == 0 && columnFormatWidth != 0) {
columnLength match {
case l if (inferShort && l <= 2) => ShortType
case l if (inferInt && l <= 4) => IntegerType
case l if (inferLong && l <= 8) => LongType
case _ => DoubleType
}
} else if (inferDecimal && columnFormatPrecision >= 1 && columnFormatWidth != 0) {
DecimalType(inferDecimalScale.getOrElse(columnFormatWidth), columnFormatPrecision)
} else if (inferFloat && columnLength <= 4) {
FloatType
} else {
DoubleType
}
} else {
StringType
}
}
"""
Please help me understand the significance of columnFormatWidth in deciding the data type, and whether this is a known issue/limitation.
Thanks in advance.
@ayan2k21 Review the docs for the parameters in the README; that should make it clearer.
By default everything is read as Double, but if you enable parameters like inferLong or inferInt you will get Long/Int, depending on the format width/precision.
I am having the same problem. Even with inferLong set to true and a SAS column of numeric type with length 8, format 8., and informat 8., it still converts to Double. Even numeric columns with a length less than 8 and no decimals convert to Double. Please help.
Edit: Using 3.0.0-s_2.12 and Parso 2.0.11 (because 2.0.14 is broken)
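For what it's worth, plugging that reported metadata (length 8, format 8., i.e. width 8, precision 0) into the quoted decision chain should yield LongType, so one plausible explanation is that parso is again reporting a format width of 0 for those columns. A standalone sketch (my own simplified re-implementation, with the Short/Int flags assumed off):

```scala
object LongInferSketch {
  // Replays the quoted numeric branch for the length-8 / inferLong case only
  // (inferShort and inferInt are assumed disabled, so the match reduces to
  // the LongType guard followed by the DoubleType fallback).
  def inferNumeric(columnLength: Int,
                   columnFormatWidth: Int,
                   columnFormatPrecision: Int,
                   inferLong: Boolean): String =
    if (columnFormatPrecision == 0 && columnFormatWidth != 0 &&
        inferLong && columnLength <= 8) "LongType"
    else "DoubleType"

  def main(args: Array[String]): Unit = {
    // What the quoted code says should happen for length 8, format 8.
    println(inferNumeric(8, 8, 0, inferLong = true)) // LongType
    // What would explain the observed behavior: parso reporting width 0
    println(inferNumeric(8, 0, 0, inferLong = true)) // DoubleType
  }
}
```

Printing the parso column metadata (format name/width/precision) for the affected columns would confirm or rule this out.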
Dataset: http://www.principlesofeconometrics.com/sas/airline.sas7bdat