JSON issues with U+FEFF #300

Open
@kts

Description

Given a JSON file containing exactly the following 8 bytes, "\ufeff" (including the quotes), JSONSerialization decodes it as an empty string rather than as String("\u{FEFF}").

(Note that this is valid JSON, not JSON data that starts with a BOM (byte order mark). All 8 bytes are ASCII.)
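
The byte-level claim above can be checked directly. The following is a minimal sketch; the names jsonText, bytes and utf8BOM are only illustrative and are not part of the original report.

```swift
// The raw JSON text from the report: a quoted string containing the escape \ufeff.
let jsonText = #""\ufeff""#
let bytes = Array(jsonText.utf8)

print(bytes.count)                     // 8
print(bytes.allSatisfy { $0 < 0x80 })  // true: every byte is ASCII

// A real UTF-8 byte order mark would be the three bytes EF BB BF at the start of the data.
let utf8BOM: [UInt8] = [0xEF, 0xBB, 0xBF]
print(bytes.starts(with: utf8BOM))     // false
```

So the input never contains the U+FEFF code point itself; it only appears once the parser expands the escape sequence.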

Steps to reproduce

import Foundation

// Raw string literal: the JSON text is the 8 ASCII characters "\ufeff"
let jsonString = #""\ufeff""#

let obj = try! JSONSerialization.jsonObject(
  with: jsonString.data(using: .utf8)!,
  options: [.allowFragments])

let string = obj as! String
print(string.count) // => 0, expected 1

Expected behavior

I would expect that, for any String value, a JSON encode followed by a JSON decode (a "round trip") gives back an equal string. Here is an example where it does not:

import Foundation

let string1 = String("\u{FEFF}")
print(string1.count) // => 1

let jsonData = try! JSONEncoder().encode(string1)

let string2 = try! JSONDecoder().decode(String.self, from: jsonData)
print(string2.count) // => 0, expected string2 == string1

Note that here jsonData is already the empty string encoded as JSON (""), so the character is lost at the encoding step.
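
To test the round-trip expectation for other strings, a small check along these lines may help. It is only a sketch: survivesJSONRoundTrip is a hypothetical helper, not a Foundation API, and it compares Unicode scalars so that the meaning of "equal" is explicit.

```swift
import Foundation

// Hypothetical helper: true if the string survives JSON encode + decode unchanged.
func survivesJSONRoundTrip(_ value: String) throws -> Bool {
    let data = try JSONEncoder().encode(value)
    // Show the encoded JSON text so a lost character is visible at the encode step
    print("encoded:", String(decoding: data, as: UTF8.self))
    let decoded = try JSONDecoder().decode(String.self, from: data)
    // Compare scalar by scalar rather than relying on String equality
    return decoded.unicodeScalars.elementsEqual(value.unicodeScalars)
}

do {
    for sample in ["hello", "\u{FEFF}"] {
        let ok = try survivesJSONRoundTrip(sample)
        print(sample.unicodeScalars.count, "scalar(s), round trip ok:", ok)
    }
} catch {
    print("encoding or decoding failed:", error)
}
```

On the environment listed below, the second sample is the one reported to fail; the result may differ on other Foundation versions.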

As someone pointed out in this forum post, these problems arise from the different behaviors of NSString and String:

import Foundation
print("\u{FEFF}".count) // => 1
print(("\u{FEFF}" as NSString).length) // => 0
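
Because U+FEFF is invisible when printed, it can help to dump a string's Unicode scalars when comparing the two types. A minimal sketch for the native String side, where scalarDescription is a hypothetical helper introduced only for illustration:

```swift
// Hypothetical helper: render each Unicode scalar as U+XXXX so invisible characters show up.
func scalarDescription(_ s: String) -> String {
    s.unicodeScalars
        .map { "U+" + String($0.value, radix: 16, uppercase: true) }
        .joined(separator: " ")
}

let bom = "\u{FEFF}"
print(scalarDescription(bom))    // U+FEFF
print(bom.unicodeScalars.count)  // 1
print(bom.utf16.count)           // 1
print(Array(bom.utf8))           // [239, 187, 191]
```

A native Swift String therefore holds one scalar and one UTF-16 code unit here, which is what makes the NSString length of 0 above surprising.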

Environment

$ swiftc -version
swift-driver version: 1.75.2 Apple Swift version 5.8.1 (swiftlang-5.8.0.124.5 clang-1403.0.22.11.100)
Target: arm64-apple-macosx13.0

Metadata

Labels: bug