It addresses the following points:

1. What is obfuscation?
2. Why use obfuscation?
3. How to evaluate obfuscation?
4. What are the main transformations?
Obfuscation is the process of transforming program code into a form that is more difficult to understand or change. It must not be confused with minification, which simply decreases the size of a file. Although minification makes the file unreadable, it is extremely easy to restore a normal layout and then understand or modify the code with ease.
More formally, obfuscation can be defined as the process of transforming a program P into a program P' such that P and P' have the same observable behavior. In particular, if the original program P doesn't terminate or terminates with an error, then its obfuscated version P' may or may not terminate. However, if the original program P terminates, its obfuscated version P' must return the same value as P.
The notion of observable behavior is important. Strictly speaking, P and P' won't have exactly the same behavior since, as we will see in the next part, some obfuscation techniques change the execution flow of the program, for example by reordering variables or functions. Observable behavior means that P and P' may behave differently, but these differences should not be perceived by the end user.
Although encryption may seem more effective than obfuscation, it is of no interest if the code is decrypted on the client's machine, since there will still be a moment when the client can see the unencrypted version of the code. However, another solution to protect the code is to move the critical logic to the server side. By providing an API that makes remote calls to the server containing the critical logic, the client no longer has access to the code. This solution adds overhead both for the client, which depends on network conditions, and for the company that provides the API, which needs to provide reliable servers, whereas with obfuscation the code is executed on the client device.
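As a minimal sketch of this definition, written in Python purely for illustration: the functions `p` and `p_prime` below are hypothetical stand-ins for P and P'. They take different internal steps but return the same value on every tested input. Comparing outputs on a few samples only gives evidence of equivalence, not a proof.

```python
def p(values):
    """Original program P: sum the values from left to right."""
    total = 0
    for v in values:
        total += v
    return total


def p_prime(values):
    """Hypothetical obfuscated version P': walks the list backwards and
    accumulates a negated total, which is flipped before returning."""
    acc = 0
    i = len(values) - 1
    while i >= 0:
        acc -= values[i]
        i -= 1
    return -acc


def same_observable_behavior(f, g, inputs):
    """Return True if f and g return equal values on every sample input."""
    return all(f(list(x)) == g(list(x)) for x in inputs)


samples = [[], [1], [1, 2, 3], [-5, 5, 10]]
print(same_observable_behavior(p, p_prime, samples))
```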
This part summarizes how to characterize the efficiency of obfuscation, as well as its impact on the program that has been obfuscated. The evaluation process is split into three components that respectively answer the following questions:

* How much obscurity has been added to the program? (human)
* How difficult is it to automate the deobfuscation process? (machine)
* How much computational overhead is added by the obfuscation?

Potency measures how much more difficult the obfuscation process makes it for a human to understand the obfuscated program. To measure whether a new obfuscated program P' is less readable than the original program P, we use metrics from software engineering. These metrics reflect the quality, readability and maintainability of a program. Traditionally, the goal is to optimize these metrics so that a program becomes easier to maintain and read. With obfuscation, the goal is to push these metrics in the opposite direction so that the program becomes more obscure. For example, where one would usually try to minimize the program's size or variable dependencies, obfuscation tries to maximize them.
Resilience measures how well an obfuscated program can resist a deobfuscator. Unlike potency, which measures the complexity added for a human, resilience measures the complexity added for a machine. It can be seen as the time required for a deobfuscator to reduce the potency of the obfuscated program, i.e. the time needed to make it more readable by a human.
Thus, the global quality of an obfuscation can be seen as a function of the potency (confusing a human), resilience (confusing a deobfuscator), and the overhead it introduces (execution time and size of the files).
In order to obfuscate a program, an obfuscator applies different kinds of transformations on the original program. We present these transformations in three categories depending on their target.
Layout transformations change the visual representation of the program. They are one way: once the transformation has been applied, the previous state can't be recovered. In this category we find simple transformations such as renaming variables and functions, or removing comments. These transformations remove the semantics carried by variable names and the hints present in comments. We also find operations such as changing the code formatting. This generally consists in removing spaces and line breaks, which is one of the reasons why minification is often confused with obfuscation.
We give an example on a simple program:
The original program could be transformed into the new program below:
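A hedged illustration of layout transformations, written in Python as an assumed stand-in language: `sum_values` is a hypothetical original program, and `a` is what it could look like after scrambling identifiers and removing comments. Both names and the function itself are made up for this sketch.

```python
# Hypothetical original program: names and comments carry meaning.
def sum_values(values):
    """Return the sum of all numbers in `values`."""
    total = 0
    for value in values:
        # Accumulate each element into the running total.
        total += value
    return total


# Same program after layout transformations: identifiers are scrambled
# and the comments inside the function are gone. Behavior is unchanged.
def a(b):
    c = 0
    for d in b:
        c += d
    return c
```

Since the original names and comments are not stored anywhere in the transformed version, there is nothing a tool could use to restore them, which is why these transformations are described as one way.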
Control transformations aim at obscuring the control flow of the application. These transformations are separated into the three subcategories presented hereafter.
Aggregation transformations separate computations that logically belong together, or merge code that does not. They are based on the idea that code grouped in a function or a class by a programmer is probably related. Splitting it across different functions or classes makes it more difficult to understand. Conversely, merging unrelated blocks of code makes it look like they have a semantic link.
Ordering transformations randomize the order in which instructions are executed. They rely on the idea that spatial locality in the code, i.e. functions or statements being close to each other, plays an important role in making the code more understandable.
Computation transformations insert dead or redundant code, or make algorithmic changes. Their goal is to hide the real control flow by polluting it with irrelevant statements. Unlike transformations that target the visual representation of the program, this set of transformations may have an impact on the execution time of the program (cost metric).
The example below shows how we could transform the previous sum function by adding dead code and altering the control flow, without changing its final result.
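A minimal sketch in Python, under the assumption that the sum function simply adds up a list of numbers. The variable `junk` and the specific predicate are illustrative choices, not taken from a particular obfuscator.

```python
def obfuscated_sum(values):
    total = 0
    junk = 7  # Bogus state: it never influences the returned value.
    for value in values:
        # Opaque predicate: x * (x + 1) is even for every integer x,
        # so this condition always holds, which is not obvious at a glance.
        if (junk * (junk + 1)) % 2 == 0:
            total += value
        else:
            # Dead code: this branch can never execute.
            total -= value * junk
        # Redundant computation that only pollutes the control flow.
        junk = (junk * 3 + 1) % 1000
    return total
```

The extra arithmetic on `junk` runs on every iteration, which is a small example of the execution cost these transformations can add.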
Another target of transformations is data structures. These transformations are classified in three categories:
This strategy relies on using a non "natural" way to encode or store data. For example, we can replace a boolean variable with two variables and a mapping function used to reconstruct the original value. If a variable v = false, we can represent it with the tuple (false, false, AND), (false, true, AND), or (true, false, AND), where AND is the && operator. By using this tuple of 3 elements, we can recompute the final value. Besides boolean variables, this technique can be generalized to other types of variables.
The example below illustrates how we can split boolean variables into a tuple of 3 elements:
Another technique often used on strings is to avoid using the raw string directly, and instead convert the string into a program that produces the string.
Aggregation transformations aim at aggregating data structures to hide their original representation. For example, they may operate on arrays by restructuring them. An array can be split into multiple sub-arrays, and multiple arrays can be merged into one. A one-dimensional array can be folded into a higher-dimensional array. The opposite, called flattening, decreases the dimension of an array. For example, representing a 2D grid with a 1D array makes it more difficult to understand the purpose of the variable.
Besides arrays, these transformations may also focus on inheritance relationships. They may increase the level of inheritance, or conversely decrease it by merging subclasses.
The example below shows how to flatten a 2D array into a 1D array:
Ordering transformations randomize the order of declarations in the source code, in particular the order of the methods and instance variables of a class, as well as the parameters of functions. It is also possible to randomize the order in which elements are stored in an array by providing a function that, given an index i, maps it back to its original position in the array.
The table below, taken from "A Taxonomy of Obfuscating Transformations", provides an overview of the different obfuscating transformations as well as their impact on potency, resilience and cost. A "+" character indicates that the value of the metric depends on the context.
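A minimal sketch of this splitting in Python, where `and` plays the role of `&&`. The helper names `split_bool` and `join_bool` are hypothetical, and the AND mapping is kept as a function rather than stored inside the tuple.

```python
import random

# Pairs (p, q) whose conjunction `p and q` yields each boolean value.
_ENCODINGS = {
    True: [(True, True)],
    False: [(False, False), (False, True), (True, False)],
}


def split_bool(value):
    """Encode a boolean as two booleans; any valid pair may be chosen."""
    return random.choice(_ENCODINGS[value])


def join_bool(p, q):
    """Mapping function: reconstruct the original value with AND."""
    return p and q


is_admin = split_bool(False)  # one of the three pairs encoding False
print(join_bool(*is_admin))   # False
```

Because `False` has three possible encodings, two occurrences of the same value can look different in the obfuscated program.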
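As a hedged sketch of this idea in Python: instead of storing a literal, the hypothetical function `hidden_message` rebuilds it from shifted character codes. The string and the shift of 3 are arbitrary choices for the example.

```python
def hidden_message():
    """Rebuild a string from shifted character codes instead of
    storing it as a literal."""
    codes = [107, 104, 111, 111, 114]  # each code is ord(char) + 3
    return "".join(chr(c - 3) for c in codes)
```

Searching the source for the original string no longer finds it, although running the function trivially reveals it.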
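A minimal sketch in Python, assuming a fixed-size grid stored row by row. The names `ROWS`, `COLS`, `flatten` and `get` are illustrative.

```python
ROWS, COLS = 3, 4


def flatten(grid):
    """Store a ROWS x COLS grid row by row in a single list."""
    return [cell for row in grid for cell in row]


def get(flat, row, col):
    """Read the cell at (row, col) from the flattened representation."""
    return flat[row * COLS + col]


grid = [[r * 10 + c for c in range(COLS)] for r in range(ROWS)]
flat = flatten(grid)
```

Every access now goes through the index arithmetic `row * COLS + col`, so the two-dimensional nature of the data is no longer visible in the variable itself.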
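The array case can be sketched as follows in Python. The affine mapping `(a * i + b) % n` and the constants 7 and 3 are assumptions chosen for illustration; any permutation of the indices would serve the same purpose.

```python
from math import gcd


def position(i, n, a=7, b=3):
    """Map logical index i to its physical slot. This is a permutation
    of 0..n-1 as long as a and n are coprime."""
    return (a * i + b) % n


def scramble(values, a=7, b=3):
    """Store the values in the order given by the mapping function."""
    n = len(values)
    if gcd(a, n) != 1:
        raise ValueError("a and len(values) must be coprime")
    stored = [None] * n
    for i, v in enumerate(values):
        stored[position(i, n, a, b)] = v
    return stored


def read(stored, i, a=7, b=3):
    """Access logical element i through the mapping function."""
    return stored[position(i, len(stored), a, b)]


days = ["mon", "tue", "wed", "thu", "fri"]
stored = scramble(days)
```

The rest of the program calls `read(stored, i)` instead of indexing directly, so the stored order no longer matches the logical order.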
| Target | Operation | Transformation | Potency | Resilience | Cost |
|---|---|---|---|---|---|
| Layout | | Scramble identifiers | medium | one way | free |
| | | Change formatting | low | one way | free |
| | | Remove comments | high | one way | free |
| Control | Computations | Insert dead code | Depends on the quality of the opaque predicate and the nesting depth at which the construct is inserted. | | |
| | | Extend loop condition | (as above) | | |
| | | Reducible to non reducible | (as above) | | |
| | | Add redundant operands | (as above) | | |
| | | Remove programming idioms | medium | strong | + |
| | Aggregation | Inline method | medium | one way | free |
| | | Interleave methods | Depends on the quality of the opaque predicate. | | |
| | Ordering | Reorder statements | low | one way | free |
| | | Reorder loops | low | one way | free |
| | | Reorder expressions | low | one way | free |
| Data | Storage and encoding | Change encoding | Depends on the complexity of the encoding function. | | |
| | | Promote scalar to object | low | strong | free |
| | | Change variable lifetime | low | strong | free |
| | | Split variable | Depends on the number of variables into which the original variable is split. | | |
| | | Convert static to procedural data | Depends on the complexity of the generated function. | | |
| | Aggregation | Merge scalar variables | low | weak | free |
| | | Insert bogus class | medium | + | free |
| | Ordering | Reorder methods and instance variables | low | one way | free |
Some transformations present in the table have not been presented in this post, but their names are rather explicit about their purpose. This post also didn't tackle the complexity of creating efficient opaque predicates capable of resisting deobfuscators. These points may be treated in future posts on obfuscation. Meanwhile, you can refer to this article for more details.
I started running into a weird problem that I didn't immediately identify. Some programs just started failing: libreoffice, chrome, chromium, firefox, eclipse, ...
It was quite nondeterministic and depended on the system resources being fairly heavily used. I thought it was a RAM issue, where not having enough memory would cause programs to fail. I have somewhat aggressive RAM settings, but I also have 16 GB of RAM on my computer. Well, it wasn't a memory issue: I was able to reproduce the issue with loads of memory still left over.
Here's some of the messages I was getting.
Libreoffice: (similar bug here)
Chrome and Chromium:
I tried this on my Dell e6530 and it seems to work pretty well.
For video acceleration :
Override software rendering list → Set to
This works for both Chrome and Chromium.
I really would like a PDF reader that saves the open pdf files in tabs, like a browser does. It would let me get back to reading whatever I was reading after a restart. Currently, I try to restart as little as possible, usually between 20 and 40 days in order to save all my open stuff. I would use Mendeley except the open tabs feature request has been open since 2009 and not implemented yet!!! http://feedback.mendeley.com/forums/4941-general/suggestions/263198-remember-open-tabs-and-position-within-pdfs.
My answer for PDFs is to use my browser to store my open files. It might be a little overkill but it does the job well enough. I mainly use Firefox + mozplugger + evince, but I also use the Chrome and Chromium browsers for different things including reading PDFs. Chrome has a simple PDF plugin that I like and it's pretty fast, but Chromium doesn't have it because of licensing issues.
I find the touchpad on my laptop way too sensitive by default. To change it, I've found the following settings comfortable:
I also have a wireless mouse that I sometimes use. It's way too sensitive. The following command makes it usable:
I use acrobat reader sometimes, like when people send me an annotated pdf or if Evince doesn't print it properly. My main pdf reader is Evince, but it's a bit buggy. Also, I use okular to annotate pdfs, which it does a fine job at.
Anyhow, I was getting the following error after applying some updates:
It took me a while to track down the problem, which has a simple solution: just find the missing symbolic link. I ran into this page that describes the issue, but nobody had provided the proper command: http://forums.fedoraforum.org/showthread.php?t=297151.
I really like the 'alternative' french keyboard layout on my computer, a.k.a fr-oss. I like to use the deadkeys for accents in french and spanish, and I really like the alt-gr + ctrl# for doing arrows and stuff like that.
Earlier this year, on my Fedora laptop (F20) for work, the right ctrl key stopped working. It sucked because ctrl-arrows no longer jumped through words in text documents. In VLC, right ctrl+Up/Down no longer changed the volume either, making me use both hands just for that. I figured it was a bug in Fedora, so I found a neat command to fix the mapping while I waited for the problem to be fixed. However, some time later it happened to my wife's Ubuntu laptop, and looking into it I found that the problem comes from higher up the chain.
Someone decided to map right ctrl to something "new" instead of letting it be the same as the left ctrl key. Sure, if programs used the "new right control" it might be great and super useful. But they don't, so the key does nothing. Kind of beats the purpose, huh?
To fix the problem, here's the command you can run.
My Dad is a wonderful person and loves his children a lot, but as of late, he's decided to torture me with pictures from the beautiful Pacific coast of Mexico. Anybody who's been to California knows just how much sun there is. All. Year. Round.
Baja California is no different from California in that respect (well, it's even better IMHO).
That's all nice and all, but I haven't seen the sun since fall started, back in October 2013 (wooooo, so long ago). In Rennes, France (in the region of Brittany), we get the same shitty weather that the English get, give or take a less-cloudy day or two. So, how do you think I feel when Dad sends me these beautiful pictures? It makes me wanna cry. I feel nostalgic thinking of all of our fishing trips. I miss the food. I miss the sun. I miss the people. I miss a whole lot of things.
For example, here's a picture from last December.
I ran into a weird bug that took me about 50 test emails to understand. Thunderbird appears to automatically convert your plain-text emails from one character set to another if it detects an incompatible character.
However, the conversion isn't as smooth as it should be. In fact, in my case, all leading spaces in my neatly formatted plain-text format=flowed emails disappear.
This post is a test for markdown and Jekyll. I basically have all kinds of examples of how to do stuff in markdown, but it's not tidy at all.
This is my first post after configuring my site to use github pages, and more specifically, after I started using Jekyll.
It's pretty easy to set up box.net with webdav and to have your files automatically saved to box.net. I got 50gb of storage space on box.net when I bought an HP Touchpad 32gb on firesale.
However, because it's not a paid or professional account, I can't really do anything with it.
UPDATE: I don't use this setup because box.net is just too slow and some people have been complaining that they don't implement webdav properly. I really wish they'd come up with a client for Linux like Dropbox has.
Originally posted on Ubuntu Forums: http://ubuntuforums.org/showthread.php?t=202761&page=4
It's easy to set up additional IP addresses on Debian Linux. This is particularly useful for the NSLU2, which doesn't have a display, so you need to connect to it remotely.
Having more than one virtual network interface allows us to have both a DHCP address and a static IP address, making the NSLU2 accessible pretty much always.
Sorry, no comments enabled for now because I'm using a simple static site generator (maybe I'll try out disqus another day).
But if you like what you see, send me an email.