• Lua Code Compression
    15 replies, posted
I've recently run across a rather interesting way that scripts on gmod could be made somewhat secure (though that's sorta a side effect of the real goal) while also being dramatically reduced in size when sending them to clients. Conceivably, lua code could be parsed into a parse tree and then regenerated as condensed lua code with consistent formatting, shortened variable names, etc., both to reduce code length and to deter code stealers (the code would still be runnable but not easily modifiable, which would also make it harder for hackers to find exploits). It would even be possible to implement optimizations (since the code is effectively being compiled from a parse tree) to increase the performance of the regenerated code. Optimizations such as allocating temporary variables outside loops could be applied automatically, as well as automated caching of values from short computations where it can be guaranteed that the result hasn't changed. Anyway, this is my idea, and I'm wondering what you guys think of it. So far it's mainly just an interesting notion that could conceivably be implemented as a way of squeezing more performance out of gmod and also as a deterrent to those interested in stealing code / hacking. (Machine generated code is truly horrible stuff to work with.) I'm thinking this would probably have to be implemented as a server side C++ module that overrides the behavior when adding client lua files and including files (something I'd actually have to do a fair bit of research into unless someone knows how it's done), with the actual file optimizer most likely coded in a mixture of lua and C++ to give users some control over it. Ideally it'd be cool to see such a feature implemented in the game, though I imagine it'd have to prove itself as a useful standalone project first. Anyway... That's my idea... Thoughts...?
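To make the "temporary variables outside loops" and "caching" ideas concrete, here is a rough hand-written sketch of the kind of rewrite such an optimizer might produce. The function names and the doubling logic are made up for illustration, and the rewrite is only valid because the temporary table never escapes the loop; a real tool would have to prove that before applying it.

```lua
-- Hypothetical "before" version: a fresh temporary table is allocated
-- on every iteration and points[i] is looked up twice.
local function sum_before(points)
  local total = 0
  for i = 1, #points do
    local tmp = { x = points[i].x * 2, y = points[i].y * 2 }
    total = total + tmp.x + tmp.y
  end
  return total
end

-- Hypothetical "after" version: the temporary is hoisted out of the loop
-- and reused, and the repeated points[i] lookup is cached in a local.
-- Safe only because tmp is never stored anywhere or returned.
local function sum_after(points)
  local total = 0
  local tmp = { x = 0, y = 0 }
  for i = 1, #points do
    local p = points[i]
    tmp.x, tmp.y = p.x * 2, p.y * 2
    total = total + tmp.x + tmp.y
  end
  return total
end

local pts = { { x = 1, y = 2 }, { x = 3, y = 4 } }
assert(sum_before(pts) == sum_after(pts))
```

Both versions compute the same result; whether the second is measurably faster depends on the runtime and would need to be benchmarked.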
Feel free to add me on Steam; I've done a bunch of things in terms of attempting to make something secure without requiring the client to use a dll. The issue is, even if you have something securely sent, and block it from being cached by sending it over net or http, there is still the issue of them redirecting RunString. The code will have to be run at some point using RunString in unencrypted form. So, you could obfuscate it and run it like that. You could redirect RunString and have it decrypt and run the original. But if the clients redirect the original, they still have a way to capture it. If the clients monitor net traffic, they could find the http or net code coming through. Even if the file is one file, etc etc etc...
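The RunString problem described above can be sketched in plain Lua. The `RunString` defined first is only a stand-in so the example runs outside the game (an assumption: like the real function, it takes the source string as its first argument); the point is that whoever wraps the function last sees the code in readable form, no matter how it was delivered.

```lua
-- Stand-in for the game's RunString so this runs in a plain Lua interpreter.
-- loadstring exists in Lua 5.1/LuaJIT, load accepts strings in 5.2+.
local load_chunk = loadstring or load

function RunString(code)
  local fn = assert(load_chunk(code))
  return fn()
end

-- A wrapper of the kind a client could install: keep a reference to the
-- original function, then replace the global with one that records its input.
local captured = {}
local original_RunString = RunString

function RunString(code, ...)
  captured[#captured + 1] = code        -- the decrypted source ends up here
  return original_RunString(code, ...)
end

-- Whatever arrives (encrypted, over net or http) must be passed in as
-- plain source at the end, so the wrapper sees it.
RunString("result = 6 * 7")
assert(result == 42)
assert(captured[1] == "result = 6 * 7")
```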
There's never a way to completely secure anything sent to the client. The main advantage of this would be the performance gain, with the side effect of [u]added[/u] (not absolute) security.
I don't think any security measures will be enough if we are going to use a language that is limited to its own environment. The only way to fully secure the code would be to implement something in the engine; otherwise everything can be cracked using reverse engineering. That's just my opinion.
You guys sorta miss the point. Obfuscation is a side effect of what it's doing. It is essentially parsing the code and regenerating an optimized and shortened version which would, in the process of regenerating it, strip out comments and meaningful variable names, and it would even complicate code for human readers by inserting optimizations, but once again, the performance is the point, not the security. Also it wouldn't be using RunString; rather, I'd like to override the process that generates the lua cache and replace the source that goes into the lua cache with code processed by the optimizer. My main question is whether this is something people would be interested in seeing done, perhaps this summer, as an open source project to generally improve the performance of gmod.
The real benefit of using some methods is that the 99% that don't know what they're doing and use a utility to extract won't be able to. They'd have to work to get it, and if you know how then you're probably skilled enough to write whatever it is you're trying to take. I'm pretty transparent with my code; I tutor a lot of people and when someone asks how I do something, I give them the basic principle and help out with some other cases. Yeah, if you use a dll you have many more options on how to load the Lua.
[QUOTE=thelastpenguin;44699735]You guys sorta miss the point. Obfuscation is a side effect of what it's doing. It is essentially parsing the code and regenerating an optimized and shortened version which would, in the process of regenerating it, strip out comments and meaningful variable names, and it would even complicate code for human readers by inserting optimizations, but once again, the performance is the point, not the security. Also it wouldn't be using RunString; rather, I'd like to override the process that generates the lua cache and replace the source that goes into the lua cache with code processed by the optimizer. My main question is whether this is something people would be interested in seeing done, perhaps this summer, as an open source project to generally improve the performance of gmod.[/QUOTE] I don't really know how you would deal with the variable names. Let's say you've got 2 scripts. Script A assigns a variable to the player via: ply.var = 1 Script B uses the above variable for its own stuff, but it's no longer there. If you are going to use some sort of randomizer for variable names, then Script B will no longer be able to access the variable, unless you save the changed name somewhere, which would require additional space, which is the opposite of compression. If you are going to randomize it using an algorithm (like replace digits with numbers), then no matter what algorithm you use it'll be easy to crack using reverse engineering - ergo no security.
[QUOTE=Netheous;44699800]I don't really know how you would deal with the variable names. Let's say you've got 2 scripts. Script A assigns a variable to the player via: ply.var = 1 Script B uses the above variable for its own stuff, but it's no longer there. If you are going to use some sort of randomizer for variable names, then Script B will no longer be able to access the variable, unless you save the changed name somewhere, which would require additional space, which is the opposite of compression. If you are going to randomize it using an algorithm (like replace digits with numbers), then no matter what algorithm you use it'll be easy to crack using reverse engineering - ergo no security.[/QUOTE] I suppose you'd only be able to obfuscate local variables and not global variables or variables set on players/entities etc.
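A small sketch of why locals are safe to rename while fields like ply.var are not. The function names are hypothetical and plain tables stand in for player objects; the two "scripts" are shown in one file only so the example runs on its own.

```lua
-- Script A, as written by the author.
local function tag_player(ply)
  local startingValue = 1
  ply.var = startingValue
end

-- Script A after a hypothetical minifier pass: locals and parameters are
-- renamed, but the field name `var` is left alone.
local function a(b) local c = 1; b.var = c end

-- Script B lives in another file and only knows the field name.
local function read_tag(ply)
  return ply.var
end

local ply1, ply2 = {}, {}
tag_player(ply1)
a(ply2)
assert(read_tag(ply1) == 1 and read_tag(ply2) == 1)  -- behaviour unchanged

-- If the minifier had also renamed the field, Script B would silently break.
local function broken(b) local c = 1; b.d = c end
local ply3 = {}
broken(ply3)
assert(read_tag(ply3) == nil)
```

Locals are only visible inside their own scope, so a tool can rename them without looking at any other file. Field names are shared string keys, so renaming them would need knowledge of every script that touches the same table.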
Once again, not trying to obfuscate... I really shouldn't have mentioned that in the OP... but the goal is to essentially compile lua code and apply optimizations in the process. As for fields, they would probably just keep their initial names, since renaming them would be outside the range of optimizations the script compiler could make. A properly designed compiler would only perform optimizations when it has enough information to check that the change won't violate any of the possible algorithm states. I'm not your typical skiddy who wants to turn his code into the da vinci cypher xD.
The Lua compiler already optimizes the byte code pretty well where it can. The only thing reducing the lua file size might do is reduce the time it takes to lex and parse the file, and even then you won't see much of a speed increase - even if you have very good reduction ratios. Having shorter variable names does not affect performance (upvalues are indexed by integer and not by string inside the byte code).
[QUOTE=thomasfn;44699876]The Lua compiler already optimizes the byte code pretty well where it can. The only thing reducing the lua file size might do is reduce the time it takes to lex and parse the file, and even then you won't see much of a speed increase - even if you have very good reduction ratios. Having shorter variable names does not affect performance (upvalues are indexed by integer and not by string inside the byte code).[/QUOTE] I wasn't actually aware the lua bytecode compiler actually performed any optimizations... do you know if there is any documentation on what it does do? I think the primary places where greater performance could be achieved would probably be function in-lining and replacing upvalues that can be resolved to values at load with their numeric values which would be particularly beneficial due to the common practice of using locals as constants.
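As a rough illustration of the "locals as constants" idea, this is what substituting an upvalue with its value could look like if done by hand. The names and the value 250 are invented for the example, and the substitution is only valid if the local is never reassigned anywhere in the file, which is something a tool would have to verify first.

```lua
-- Common pattern: a file-level local used as a constant.
-- Inside the function it is an upvalue, not a literal.
local MAX_SPEED = 250

local function clamp_speed(v)
  if v > MAX_SPEED then
    return MAX_SPEED
  end
  return v
end

-- Hand-written "after" version: the upvalue is replaced by its numeric value.
-- Only correct because MAX_SPEED is assigned exactly once above.
local function clamp_speed_inlined(v)
  if v > 250 then
    return 250
  end
  return v
end

for _, v in ipairs({ 0, 100, 250, 251, 1000 }) do
  assert(clamp_speed(v) == clamp_speed_inlined(v))
end
```

Whether this actually buys any speed is a separate question, since a JIT may already treat such upvalues as constants; it would need measuring.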
[QUOTE=thelastpenguin;44699928]I wasn't actually aware the lua bytecode compiler actually performed any optimizations... do you know if there is any documentation on what it does do? I think the primary places where greater performance could be achieved would probably be function in-lining and replacing upvalues that can be resolved to values at load with their numeric values which would be particularly beneficial due to the common practice of using locals as constants.[/QUOTE] I'm not sure exactly what it does, I've never really needed to look into it. Probably the more obvious stuff like removing unnecessary stack operations (like popping a value and pushing it straight back again) and things like that.
[QUOTE=thelastpenguin;44699928]I wasn't actually aware the lua bytecode compiler actually performed any optimizations... do you know if there is any documentation on what it does do? I think the primary places where greater performance could be achieved would probably be function in-lining and replacing upvalues that can be resolved to values at load with their numeric values which would be particularly beneficial due to the common practice of using locals as constants.[/QUOTE] [url]http://www.lua.org/manual/4.0/luac.html[/url]
A few points:
- I'm not really sure why everyone is so concerned with Lua file compression. As far as I've seen, it's not the Lua files that take up the bulk of a server's/game's addons, it's all of the other content.
- As long as you don't have shit code, speed shouldn't be an issue; ergo, don't rely on another tool to ensure that you're coding correctly. That should be done in the first place.
- Why are you so concerned with "code security"? The fact that GMod HAS this openness in its addon style is beneficial to everyone imo. Yeah, I hate when people steal my code and claim it as theirs, it's happened to me quite a few times over the years, but if we're ever going to try to get this back to where it was when I first joined (supportive and everyone seemed to not be such a dickbag), then why not let people see your code so they can reference it and learn from it?
Obfuscating things further than you need to really is just overcomplicating a lot of what GMod already does and is really quite unnecessary. If you're so concerned with micro-optimizations to keep things running faster, do it yourself on a file-by-file basis. Just my 2 cents.
Just found this article: [url]http://article.gmane.org/gmane.comp.lang.lua.general/58908[/url] Apparently LuaJIT does in fact do a significant amount of optimization for us, though finding out exactly what it is doing is a bitch by the author's own admission. In any case it seems you guys may be right that this level of precompilation is possibly unnecessary, at least with luajit, but it is a pretty fascinating read in terms of what it's doing behind the scenes if you're really looking to maximise the speed you can get out of your code by writing it for the jit optimizer. The three most remarkable optimizations it seems to be doing are:
- inlining of short trip loops (that only iterate a few times)
- reduction from doubles to ints to optimize arithmetic by distributing work onto the integer units of the CPU in cases where it can predict that a value will never have a decimal part. (I think this is part of why bitshifting can be made extremely fast)
- Lastly, it uses a special case to make indexing tables like tbl.key equivalent to an array index in terms of overhead, so you don't have to worry that it's doing a full hash lookup anymore.
Lua files are compressed when they are sent to clients, no need to compress them more.