[json-processing-spec users] Re: Introduction and some initial user thoughts.
- From: Tatu Saloranta <
- Subject: [json-processing-spec users] Re: Introduction and some initial user thoughts.
- Date: Sun, 25 Mar 2012 14:54:45 -0700
On Sun, Mar 25, 2012 at 2:32 PM, David Illsley wrote:
> On 25 Mar 2012, at 20:30, Jitendra Kotamraju wrote:
>> I have some more internal feedback which came in late. I put that before
>> bringing to the EG. Then we can discuss further.
>> Tatu Saloranta wrote:
>>> On Sat, Mar 24, 2012 at 12:13 PM, David Illsley wrote:
>>> One question here is whether chaining is purely for convenience,
>>> returning the writer itself, or creates a new write context. I assume
>>> this would be just the former, which is more lightweight (although the
>>> latter has its benefits too, since contexts can be passed around more safely).
> In this case I went for a new object for each new JSON object/array. Yes,
> there's an overhead, but for simple, small documents, developer
> productivity/clarity outweighs that (IMO). Being able to use
> content assist to see only valid completions is something I particularly
> like with this approach.
> I'm not suggesting the main low-level parser API should be like this, but a
> low-level API which everyone wraps isn't a huge improvement on the status quo.
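The approach David describes, where starting a JSON object yields a dedicated context object, can be sketched roughly as follows. This is a minimal illustration with hypothetical names, not the spec API; it shows why content assist can offer only structurally valid calls:

```java
// Minimal sketch (hypothetical API, not the spec): starting an object
// yields a context object, so the IDE offers only calls that are
// structurally valid at that point.
public class ContextBuilderSketch {
    static class ObjectContext {
        private final StringBuilder out;
        private boolean first = true;

        ObjectContext(StringBuilder out) {
            this.out = out;
            out.append('{');
        }

        // Write a string-valued object member; returns this for chaining.
        ObjectContext string(String name, String value) {
            if (!first) out.append(',');
            first = false;
            out.append('"').append(name).append("\":\"").append(value).append('"');
            return this;
        }

        // Close the object and return the serialized result.
        String end() {
            out.append('}');
            return out.toString();
        }
    }

    public static void main(String[] args) {
        String json = new ObjectContext(new StringBuilder())
                .string("firstName", "John")
                .string("lastName", "Smith")
                .end();
        System.out.println(json); // {"firstName":"John","lastName":"Smith"}
    }
}
```

The overhead David mentions comes from allocating one such context object per JSON object or array; for small documents that cost is negligible compared to the clarity gain.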
Right, and I used such an approach with StaxMate.
I do like it. StaxMate does about this on top of the basic Stax API, both
for parsing and generation.
One follow-up question, then, is how many APIs we should provide:
incremental (highest performance), tree model, builder-based?
Also: this goes to the question of who the target group is: my
experience with the Stax API, for example, suggests that in many cases
end developers are not the majority of users.
This may or may not be the case here. But framework developers have
quite different needs than app developers.
>>>> .string("firstName", "John")
>>> I assume 'string()' here means "write an Object property with name
>>> 'firstName', and value "John"?
>>> This only works in Object context (i.e. when there is an open object);
>>> for array contexts we need a separate set of methods. There it'd be
>>> possible to just take one argument.
> Yeah. I'm using different interfaces for writing an object and writing an
> array... see
Ok yes, makes sense then.
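David's two-interface approach could be sketched like this (hypothetical names, not the proposed spec API): the object-writing interface takes (name, value) pairs while the array-writing one takes bare values, so a name-less write inside an object simply does not compile.

```java
// Sketch of the "different interfaces" idea (hypothetical names):
// object members need a name, array elements are bare values.
public class SeparateContexts {
    interface ObjectWriter {
        ObjectWriter string(String name, String value); // member needs a name
        String endObject();
    }

    interface ArrayWriter {
        ArrayWriter string(String value);               // bare element value
        String endArray();
    }

    static ObjectWriter object() {
        StringBuilder out = new StringBuilder("{");
        return new ObjectWriter() {
            boolean first = true;
            public ObjectWriter string(String name, String value) {
                if (!first) out.append(',');
                first = false;
                out.append('"').append(name).append("\":\"").append(value).append('"');
                return this;
            }
            public String endObject() { return out.append('}').toString(); }
        };
    }

    static ArrayWriter array() {
        StringBuilder out = new StringBuilder("[");
        return new ArrayWriter() {
            boolean first = true;
            public ArrayWriter string(String value) {
                if (!first) out.append(',');
                first = false;
                out.append('"').append(value).append('"');
                return this;
            }
            public String endArray() { return out.append(']').toString(); }
        };
    }

    public static void main(String[] args) {
        System.out.println(object().string("firstName", "John").endObject());
        System.out.println(array().string("a").string("b").endArray());
    }
}
```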
>>> But more generally: should it be possible to separately write the field
>>> name and the value (in which case the value write method can be shared
>>> across all contexts), or should the field name always be required?
>>> Requiring the field name can reduce cases of possible errors; but it makes
>>> some delegation cases more difficult, or requires the caller to always
>>> pass the field name (specifically, with object serialization, value
>>> serializers are separate, but someone has to write the field name too).
>>>> 2. Numbers.
>>>> In order to explore the problem space, I knocked together my own
>>>> API/parser a little while ago, and I kind of punted on dealing with
>>>> numbers because it's not obvious what to do. On the writer-side, simply
>>>> taking Number seems to work. On the reader side, particularly for
>>>> application (vs library)
>>> It works, but it's a bit wasteful. Since the goal of a low-level generator
>>> is to keep overhead reasonably low, it may make sense to avoid wrappers.
>>> So I would prefer exposing primitives too, although we can also allow
>>> Number for convenience.
>>>> users, the user probably knows what precision they want, and is happy
>>>> to ask for it. Expecting
>>>> users to do some if/else instanceof tree + conversion every time they
>>>> get a number seems like a poor solution.
>>> I agree; there should be way(s) for the caller to indicate the kind of
>>> number they want. There are many follow-up questions, such as whether
>>> and how to deal with over/underflows.
>>> Also, I like the idea of exposing some kind of type-detection method,
>>> as suggested later on.
>>> Parser can easily detect basic integer/floating-point difference from
>>> syntactic representation at least.
>>>> One option which appeals to me is to add asInt() asFloat() asDouble()
>>>> etc to JsonNumber. They would simply do the boilerplate transformations,
>>>> possibly losing precision, but giving the app developer what they want.
>>> I think asInt() etc can be exposed by parser/reader object directly.
>>> But maybe you are thinking of JsonNumber as being part of tree model?
>>> (my understanding is that there is lower level incremental parser, and
>>> tree model would be layer above it)
>>> For that these methods would make sense too.
> Agreed. I guess I was thinking that people working at the low-level
> mightn't be as bothered about having to check types and are more likely to
> be library developers than app developers.
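The asInt()/asDouble() idea could look roughly like this. This is a hypothetical sketch backed by a BigDecimal, not the actual JsonNumber API; the conversions knowingly lose precision, which is the point being made above:

```java
import java.math.BigDecimal;

// Hypothetical sketch of asXxx() conversions on a tree-model number
// node: boilerplate is done once, possibly losing precision, and the
// app developer simply asks for the type they want.
public class JsonNumberSketch {
    private final BigDecimal value;

    JsonNumberSketch(String literal) {
        this.value = new BigDecimal(literal);
    }

    int asInt()       { return value.intValue(); }    // may truncate/overflow silently
    long asLong()     { return value.longValue(); }
    double asDouble() { return value.doubleValue(); } // may lose precision

    public static void main(String[] args) {
        JsonNumberSketch n = new JsonNumberSketch("3.75");
        System.out.println(n.asInt());    // 3
        System.out.println(n.asDouble()); // 3.75
    }
}
```

Whether overflow should be silent (as here) or throw is one of the follow-up questions raised earlier in the thread.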
>>>> b. For the library/process arbitrary document case, is there a
>>>> performant, consensus algorithm to decide which of
>>>> float/double/BigDecimal to use, and is specification of one in-scope for
>>>> this JSR?
>>> There are existing implementations; Jackson (which I wrote) has its
>>> own handling, exposing the "smallest necessary" type for integrals
>>> (int/long/BigInteger), and requiring the caller to ask for a floating-point
>>> type; parsing of numeric values is lazy, while syntactic checks are eager.
>>> That is, parser can verify syntactic validity of numbers first, but
>>> defer expensive decoding of floating-point numbers until expected type
>>> is known.
>>> Others can suggest other methods I'm sure.
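The "smallest necessary type" approach with eager syntactic checks and lazy decoding, as described for Jackson above, might be sketched like this (a simplification with hypothetical class names, not Jackson's actual implementation):

```java
import java.math.BigInteger;

// Sketch of "smallest necessary" integral typing: validate the token
// syntactically up front, keep the raw text, and decide on
// int/long/BigInteger only when a value is actually requested.
public class LazyIntegral {
    private final String raw;

    LazyIntegral(String raw) {
        // Eager syntactic check: must look like a JSON integer.
        if (!raw.matches("-?\\d+")) {
            throw new IllegalArgumentException("not an integer: " + raw);
        }
        this.raw = raw; // expensive decoding deferred until needed
    }

    // Lazily determine the smallest Java type that holds the value.
    Class<?> smallestType() {
        try { Integer.parseInt(raw); return Integer.class; }
        catch (NumberFormatException e) { /* too big for int */ }
        try { Long.parseLong(raw); return Long.class; }
        catch (NumberFormatException e) { /* too big for long */ }
        return BigInteger.class;
    }

    public static void main(String[] args) {
        System.out.println(new LazyIntegral("123").smallestType().getSimpleName());
        System.out.println(new LazyIntegral("123456789012").smallestType().getSimpleName());
        System.out.println(
            new LazyIntegral("123456789012345678901234567890").smallestType().getSimpleName());
    }
}
```

Floating-point values would follow the same pattern, but (as noted above) with the caller required to name the target type rather than the parser guessing.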
>>>> 3. Tree navigation, instanceof, and nulls.
>>>> My main pain points with the XML DOM are around constant instanceof
>>>> checks and null-checks when navigating the tree.
>>>> When you know what you're looking for, XPath is pretty much required to
>>>> keep your code from becoming spaghetti. I see a similar danger of
>>>> requiring constant null-checks navigating a tree here.
>>>> What are your thoughts on having some kind of path api (could be as
>>>> simple as <T extends JsonValue> JsonObject.getPath(String path, Class<T>
>>>> clazz) to make it simple to grab some nested information?
>>> While path expressions are useful and powerful, I think path
>>> expression language is out of scope.
>>> Especially since, while there are multiple experimental languages
>>> (JSONPath and JSONQuery at least), none is really widely used as far as I know.
>>> That is, defining path/expression language seems modular add-on piece;
>>> and once something standard exists, then Java bindings should follow
>>> naturally. But this is not yet the case.
> I understand the concern, and maybe it's better as a standalone class in
> the API, but I am keen that there's a way to extract information from the
> tree model that's easy to write and easy to read. Nested null checks don't
> lead to attractive, easy to read/understand code.
Yes, but path expressions are not a requirement for avoiding null checks.
One alternative is to expose virtual "null nodes", which can be
traversed, but basically always evaluate to an empty value. In addition,
you can also add "auto-build", wherein you can actually build the path
as you logically traverse ("if you need an Object property for 'x',
and there isn't one, one will be added").
I added both to Jackson's tree model; and while the naming may not be
optimal, the concept is pretty useful:
1. "get" does not do checks, and result is null if no matching
property or element exists.
2. "path" produces virtual null nodes as necessary (path("property")
or path(1) for element)
3. "with" adds necessary property/element when used for traversal;
except if a property/element already exists it is either used (if
expected type), or an exception is thrown (if incompatible type)
So I guess this is sort of "path methods" versus "path expressions".
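The three "path methods" above could be sketched as follows. This is a hypothetical, heavily simplified node class, not Jackson's JsonNode; in particular, the type-compatibility check described for "with" is omitted:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of path methods on a tree node (hypothetical class):
//   get()  - plain lookup, may return null
//   path() - returns a shared virtual "missing" node, safe to traverse
//   with() - auto-builds the child if absent
public class PathMethods {
    static final PathMethods MISSING = new PathMethods();

    private final Map<String, PathMethods> children = new HashMap<>();

    PathMethods get(String name) {
        return children.get(name);                     // null if absent
    }

    PathMethods path(String name) {
        PathMethods child = children.get(name);
        return child == null ? MISSING : child;        // never null, always traversable
    }

    PathMethods with(String name) {
        return children.computeIfAbsent(name, k -> new PathMethods()); // auto-build
    }

    boolean isMissing() { return this == MISSING; }

    public static void main(String[] args) {
        PathMethods root = new PathMethods();
        // No null checks needed even though nothing exists yet:
        System.out.println(root.path("a").path("b").isMissing()); // true
        root.with("a").with("b");                                 // builds the path
        System.out.println(root.path("a").path("b").isMissing()); // false
    }
}
```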
I think both are useful, and I would love to have a usable real
expression system. I just don't know of one that exists in usable form yet.
>>>> instanceof checks and casting also seem to be common with the current
>>>> API if you're walking an arbitrary tree and inspecting the contents. I
>>>> know this is less common with JSON than with XML, but I do think it's
>>>> worth making easier.
>>> Right, I would want to minimize (if not eliminate) such checks.
>>> With Jackson, division of methods was such that all read methods are
>>> exposed at root JsonNode level, but modifications of structured types
>>> (adding properties to Objects, elements to Arrays) are exposed only at
>>> specific sub-type.
>>> (javadocs can be seen at
>>> [http://jackson.codehaus.org/1.9.4/javadoc/index.html], see JsonNode
>>> and subtypes).
>>> Many JSON packages implement their own view of Tree Model, so this is
>>> rather well experimented area.
>>>> What do you think of adding a JsonValueTypeEnum getType() to JsonValue?
>>> This could work, but it is also possible to have distinct
>>> 'isNumber()', 'isArray()' etc methods, all implemented by all value types.
>>> But I think such an enumeration definitely makes sense for the incremental
>>> parser API.
> I'm personally a fan of smaller APIs, and so an enum based approach appeals
> to me.
I guess it goes to a matter of taste (procedural vs OO, minimal method
counts vs convenience); I don't have a strong preference, actually.
One possibility is to use enums only for the lowest-level API, and the
more convenient (IMO) isXxx() methods for the tree model.
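The two styles discussed could also be combined, as sketched below (all names hypothetical, not the spec's): a single getType() enum carries the information, and isXxx() predicates are derived from it, so the tree model gains convenience without growing the core API.

```java
// Sketch combining the two styles: enum-based getType() for dispatch
// (and a natural fit for incremental parser events), with isXxx()
// convenience methods trivially derived from it. Hypothetical names.
public class ValueTypeSketch {
    enum ValueType { OBJECT, ARRAY, STRING, NUMBER, TRUE, FALSE, NULL }

    static abstract class JsonValueSketch {
        abstract ValueType getType();

        // OO-style predicates derived from the enum:
        boolean isNumber() { return getType() == ValueType.NUMBER; }
        boolean isArray()  { return getType() == ValueType.ARRAY; }
    }

    // Hypothetical factory for a number-typed value.
    static JsonValueSketch numberValue() {
        return new JsonValueSketch() {
            ValueType getType() { return ValueType.NUMBER; }
        };
    }

    public static void main(String[] args) {
        JsonValueSketch v = numberValue();
        switch (v.getType()) {            // enum style: exhaustive dispatch
            case NUMBER: System.out.println("number"); break;
            default:     System.out.println("other");
        }
        System.out.println(v.isNumber()); // predicate style: true
    }
}
```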
-+ Tatu +-