vON: Vercas' Object Notation [Release for developers]

GitHub Repository

[H2]Why?[/H2]
Well, the current serialization method available in GMod - GLON - is slow. And some trials with (extremely) complex data structures have failed miserably.
I also don’t like the idea of using non-printable characters.
What I was using before writing vON was JSON. But JSON had a problem.
I couldn’t find a solid parser. The best one that I found had some miserable failures in the end.
Also the JSON implementations are bound to the JSON standard (obviously), so they don’t support things like booleans or even tables as keys.
[HR][/HR]
[H2]Specifications[/H2]
As in Lua, the “main” data structure is the table. Unlike Lua tables, however, vON tables are split into two parts: the numeric (array) component and the key:value pairs (dictionary) component.
Tables start with { and end with }, and the two components are separated by a ~ (tilde) character, which may be absent if the table is a pure array.
Examples: the format is { … ~ … } and the data { "lolv" "maov" ~ "lolv":"maov" } would be {"lol","mao",lol="mao"} in Lua.

The first table in the data, the chunk, has no opening and closing characters (they would be useless).
Example: "lolv""maov"~"lolv":"maov", which means the same thing as the example above enclosed in { }.
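
To make the array/dictionary split concrete, here is a minimal sketch of how a Lua table could be divided into the two components written on either side of the ~. The helper name split_components is hypothetical and is not part of vON; it only illustrates the idea and assumes the array part has no holes.

[lua]-- Hypothetical helper (not part of vON): split a table into the array
-- component and the dictionary component.
local function split_components(t)
    local array, dict = {}, {}
    local n = #t -- length of the array part (assumes no holes)

    for i = 1, n do
        array[i] = t[i]
    end

    for k, v in pairs(t) do
        -- Anything that is not an integer index in 1..n is a dictionary entry.
        local in_array = type(k) == "number" and k >= 1 and k <= n and k % 1 == 0
        if not in_array then
            dict[k] = v
        end
    end

    return array, dict
end

local array, dict = split_components({ "lol", "mao", lol = "mao" })
print(#array, dict.lol) -- prints 2 and mao, separated by a tab[/lua]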

Also, for the sake of parsing speed, some types (such as booleans and numbers) are prefixed with a character. If no valid prefix character is found, the type of the previous value is reused automatically.
Inside tables, spaces, tabs and newlines are simply ignored when deserializing.

Numbers are currently declared like this: n… (where … represents the value in base 10). They end in ;, }, a newline, : or ~ (at least one must be present).
Example: n1;2;3~4:4; and {n1;2;"intruder!v"n4}.
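
As a rough illustration of the number rule, the sketch below reads a run of numbers that shares a single n prefix. The function read_numbers is a hypothetical helper, not vON's actual deserializer, and it only handles a pure run of numbers (no nested tables or other types).

[lua]-- Hypothetical helper: read a run of numbers such as "n1;2;3;".
local function read_numbers(s)
    assert(s:sub(1, 1) == "n", "expected the 'n' prefix")

    local out = {}
    -- Each value is a run of characters up to one of the delimiters
    -- listed above: ; } : ~ or a newline.
    for token in s:sub(2):gmatch("[^;}:~\n]+") do
        out[#out + 1] = tonumber(token)
    end
    return out
end

local nums = read_numbers("n1;-1337;-99.99;2;")
print(#nums) -- prints 4[/lua]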
If I get enough moral support, I will add a compressor for the numbers!

Booleans are prefixed by b and are represented either by a 1 (true) or 0 (false). They are represented by a single character so they don’t have a delimiter.
Boolean flags usually look like this: b101101001.
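
Because every boolean is exactly one character, a run of flags can be decoded without looking for delimiters. A minimal sketch, assuming the input is a pure boolean run (read_booleans is a hypothetical name, not a vON function):

[lua]-- Hypothetical helper: decode a run of boolean flags such as "b101101001".
local function read_booleans(s)
    assert(s:sub(1, 1) == "b", "expected the 'b' prefix")

    local out = {}
    for i = 2, #s do
        local c = s:sub(i, i)
        if c == "1" then
            out[#out + 1] = true
        elseif c == "0" then
            out[#out + 1] = false
        else
            break -- anything else ends the boolean run
        end
    end
    return out
end

local flags = read_booleans("b101101001")
print(#flags, flags[1], flags[2]) -- prints 9, true and false[/lua]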

Strings start and end with double quotes ("). Quotes inside strings are escaped with a backslash (\). Only quotes are escaped for now.
A "v" is added at the end of every string to make sure the string doesn’t end in a backslash (which would break the deserializer).
That’s 1 character more per string, but it should be a fair sacrifice considering the speed at which it deserializes strings (especially strings that are kilobytes in size).
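
The string rules can be sketched as follows. Both encode_string and decode_string are hypothetical helpers written for this illustration, not vON's real code; they assume the escape character is a backslash and that only double quotes are escaped.

[lua]-- Hypothetical helpers illustrating the string rules described above.
local function encode_string(s)
    -- Escape double quotes only, then append the "v" guard character.
    local escaped = s:gsub('"', '\\"')
    return '"' .. escaped .. 'v"'
end

local function decode_string(data, start)
    -- data:sub(start, start) is the opening quote. The closing quote is the
    -- first quote that is not preceded by a backslash. Thanks to the guard
    -- "v", the character before the closing quote is never a backslash.
    local guard = data:find('[^\\]"', start + 1)
    assert(guard, "unterminated string")

    local body = data:sub(start + 1, guard - 1) -- drops the trailing "v"
    -- Return the unescaped string and the position right after the closing quote.
    return (body:gsub('\\"', '"')), guard + 2
end

local encoded = encode_string('ma"ra')
print(encoded)                   -- prints "ma\"rav" (including the outer quotes)
print(decode_string(encoded, 1)) -- prints ma"ra and 10[/lua]

Without the guard character, a string whose content ends in a backslash would make the closing quote look escaped, which is why the search for an unescaped quote stays this simple.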

This was written in pure Lua for Windows. (It will obviously work on other OSes too!)
GMod-specific types are included in a different version.

In vON, keys and values can be anything.
You can even have booleans and tables as keys!

[h2]Code[/h2]

I’m not going to dump the code here.
A code file for comparing this with GLON and dkJSON is available here. (Might be outdated, so replace the vON code in that one with the latest release.)

Latest stable versions:

Development versions:

  • Pure Lua
  • GLua
    These are not to be used in any projects. They’re the files I work and test on before updating the stable versions.
    They’re here for the curious to peek.

[h2]Comparison[/h2]

It’s more than twice as fast and occupies less space than GLON.
Unlike JSON, it’s not bound to a standard that imposes restrictions such as all table keys being strings.

I’ve tested vON with the following piece of code:
[lua]local test6 = {
    1, -1337, -99.99, 2, 3, 100, 101, 121, 143, 144, "ma\"ra", "are", "mere",
    {
        500, 600, 700, 800, 900, 9001,
        TROLOLOLOLOLOOOO = 666,
        [true] = false,
        [false] = "lol?",
        pere = true,
        [1997] = "vasile",
        [{ [true] = false, [false] = true }] = { [true] = "true", ["false"] = false }
    },
    true, false, false, true, false, true, true, false, true,
    [1337] = 1338,
    mara = "are",
    mere = false,
    [true] = false,
    [{ [true] = false, [false] = true }] = { [true] = "true", ["false"] = false }
}

-- Serialize and deserialize the same data 5000 times in a row.
local last = test6

for i = 1, 5000 do
    local s = von.serialize(last)

    --print(to_string(t, 0))
    --print(s)

    last = von.deserialize(s)
end

print(von.serialize(last))[/lua]
vON produced a flawless result.
GLON produced a disaster…
JSON cannot encode that.

The code in the linked comparison outputted the following for me:


GLON:
	5000 encoding/decoding successions took 4.2862453460693 seconds to finish.
	Length of final (probably overmutilated) data: 597.

JSON:
	5000 encoding/decoding successions took 2.1321220397949 seconds to finish.
	Length of the final (100% mutilated) data: 545.

vON:
	5000 encoding/decoding successions took 1.4760837554932 seconds to finish.
	Length of the final (100% healthy) data: 583.
[Finished in 7.9s]

[h2]Examples[/h2]

You must be aware of the ugliness of the output already.

The code above produces:


n1;-1337;-99.99;2;3;100;101;121;143;144;"ma\"ra""are""mere"{n500;600;700;800;900;9001~"TROLOLOLOLOLOOOO":n666;b1:0{~b0:11:0}:{~b1:"true""false":b0}"pere":b1n1997:"vasile"b0:"lol?"}b100101101~1:0{~b0:11:0}:{~b1:"true""false":b0}n1337:1338;"mara":"are""mere":b0

Yeah, it looks like s**t.

Previously, vON featured a “nice mode” for formatting the code for human readability.
Well, it’s no longer supported. It was slow and a pain in the arse to maintain.

[h2]Usage[/h2]
[lua]von -- The global table of the library.

von.serialize -- A "functable" containing the serialization procedures.
-- The internal functions are exposed because someone might find a use for 'em.
von.serialize(table data) -- Serializes the table into a string.

von.deserialize -- A "functable" containing the deserialization procedures.
-- The internal functions are, again, exposed, because someone might find them useful.
von.deserialize(string data) -- Deserializes the specified data into a table.
-- (Non-numeric) key order might not be preserved, but it shouldn't matter.[/lua]

[h2]Final thoughts[/h2]

Credits and appreciations are in the file.
This is not intended to replace or work with anything in particular. I made this for myself, for my gamemode, to store data in a quick, flexible and, thus, efficient way.
If you don’t like it, either suggest an improvement or leave the thread. I don’t want or need hateful opinions.

Also, I apologize for the bad-ish formatting of the thread…
And I apologize if my text sounds hateful/mean. It’s just my way of writing, it’s not on purpose. :smile:

By publishing this I’m getting no profit. My only intention is to help, to make someone’s day a little brighter. Please read the notice in the code file and do as it says.

[h2]If you have ideas for improvement, optimizations or bug reports, please post them in this thread![/h2]

[h2]Changelog[/h2]

All times are GMT.

2012.07.06 8:20 - Version 1.0.0

  • Started making a changelog.
  • Declared version 1.0.0

2012.08.02 8:55 - Version 1.1.0

  • Fixed errors on Angle and Vector deserialization saying they’re entities.
  • Added errors when trying to (de)serialize the wrong types.
  • Fixed GLua version’s distribution link pointing to the pure Lua version.
  • Added Player data type to the GLua version. Players are saved by their entity ID.
  • Removed two redundant arguments in the deserialization functable.

2013.09.28 15:53 - Version 1.1.1

  • Fixed handling of nil values when deserializing array components. Thanks to Chessnut for pointing out the bug.

This looks pretty cool. I’d definitely use it over glon any day.

Going to use this in my gamemode :wink:

No more string.Explode!

Well, uh, yeah?
It’s written in pure Lua. :wink:

Looks excellent, will be using this in my gamemode as well.

Also add “?dl=1” to your download links so they autodownload when someone clicks them

I’m so glad you like it!

I’ll add GMod-specific types tomorrow, and I’m looking into a way to compress the output string, or at least the numeric types…

Thank you!

You did a great job, vercas.
I lov you. No homo.

Thanks! :smile:

[hr] [/hr]

I’m looking for some input now.
So, how should I store angles? Should I reduce them to [0; 360] or [-180; 180]? Should I convert them to numbers like pitch*360*360 + yaw*360 + roll? In [0; 360] the number will vary from 0 to 46,785,960. In [-180; 180] the number will vary from -23,392,980 to 23,392,980.
Using the interval [0; 360] will cut down on one character.
Most of the time, the numeric representation will be shorter than the values written together. 46785960 is shorter than 360,360,360 by 3 characters…
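
The arithmetic behind those bounds can be checked with a short sketch. pack_angle is a hypothetical function that only mirrors the formula from the post and assumes whole-number angles; it is not something vON implements.

[lua]-- Hypothetical sketch of the packing formula (integer angles only).
local function pack_angle(pitch, yaw, roll)
    return pitch * 360 * 360 + yaw * 360 + roll
end

print(pack_angle(360, 360, 360))    -- prints 46785960
print(pack_angle(-180, -180, -180)) -- prints -23392980

-- With both ends of [0; 360] allowed, different angles can collide:
print(pack_angle(0, 1, 0) == pack_angle(0, 0, 360)) -- prints true[/lua]

The collision in the last line suggests that a scheme like this would need a half-open range such as [0; 360) to be reversible, and fractional angles would not fit into it at all.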

[editline]asd[/editline] I just realized what a retarded thing I suggested above.

Angle values go from 0 to 360, as far as I know.

Well, many functions actually return angles in the [-180; 180] range. (At least in E2 :v:)

Angles are -180 to 180 (Although it’ll tend to get clamped, so it’s best using 179)

Hm.

Now that I’ve refreshed my mind, it’s not such a good idea to add up the angles to a number because of the decimals. It was actually a very dumb idea…

Damn, angles and vectors are going to be big…

I think I will clamp all numbers to 3 decimals.
Precision decreases dramatically after the 2nd decimal anyway.
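
One possible way to limit a number to 3 decimals before serializing it, shown only as a sketch (round3 is a hypothetical name, and rounding to the nearest value is an assumption; truncation would be another option):

[lua]-- Hypothetical helper: round a number to 3 decimal places.
local function round3(x)
    return math.floor(x * 1000 + 0.5) / 1000
end

print(round3(12.34567)) -- prints 12.346
print(round3(90))       -- prints 90[/lua]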

Oops, sorry then.

Sorry for?

Anyway, does anyone fancy big numbers?
I’ve been using this library for a while and it’s actually awesome. I could easily implement them (as strings…) in vON as a separate type.

Oh and btw, roll is measured -45 to 45 iirc.

Nice work.

For some big speed increases, I suggest using T[#T+1] = Val instead of insert(T,val)
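
A rough way to compare the two append styles is sketched below. The function names and the element count are arbitrary choices for illustration, and the timings will differ between machines and Lua implementations, so no numbers are claimed here.

[lua]local insert = table.insert
local N = 200000 -- arbitrary element count for the comparison

local function fill_with_insert()
    local t = {}
    for i = 1, N do
        insert(t, i)
    end
    return t
end

local function fill_with_index()
    local t = {}
    for i = 1, N do
        t[#t + 1] = i
    end
    return t
end

-- Returns the elapsed CPU time in seconds and the function's result.
local function time_it(f)
    local start = os.clock()
    local result = f()
    return os.clock() - start, result
end

local t_insert, a = time_it(fill_with_insert)
local t_index, b = time_it(fill_with_index)
print("table.insert:", t_insert)
print("t[#t + 1]:   ", t_index)[/lua]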

Also at the top of the file it says “authot”

EDIT: Oh and this


local stuff = { ["\\"] = "\\\\", ["\""] = "\\\"" }
gsub(data, stuff)

might be faster than


gsub(gsub(data, "\\", "\\\\"), "\"", "\\\"")

Very interesting, I may have to rewrite a lot of the code in the gamemode I’m working on because of this new discovery…

This is badass. Will be using from now on. Thanks.

Thanks! I fixed the typo and used T[#T + 1] as you suggested.
It cut about 0.1 seconds in the benchmark. Now it’s definitely faster than JSON!

Um… This doesn’t even work. It doesn’t accept tables as the second argument.
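
For reference, string.gsub takes the pattern as its second argument and the replacement as its third, and in Lua 5.1 the replacement may be a table that is looked up for each match. A minimal sketch of the table-based variant with the pattern added (the names escapes and escape are only for illustration):

[lua]local escapes = { ["\\"] = "\\\\", ["\""] = "\\\"" }

local function escape(data)
    -- "[\\\"]" is a character set matching one backslash or one double quote.
    -- Each match is used as a key into the escapes table.
    return (data:gsub("[\\\"]", escapes))
end

print(escape('a"b\\c')) -- prints a\"b\\c[/lua]

The extra parentheses around the gsub call drop its second return value (the number of substitutions).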