Obfuscation

by Walter Rudametkin     Obfuscation  

Posted 2017.10.01  

This post is a general presentation of obfuscation, inspired by two sources:

- A taxonomy of obfuscation transformations
- A presentation by Pedro Fortuna, founder of JSCrambler

It addresses the following points:

1. What is obfuscation?
2. Why use obfuscation?
3. How to evaluate obfuscation?
4. What are the main transformations?

I wrote this post because I regularly have to look at obfuscated JS code and wanted to know more about the different obfuscation strategies. The second reason is that I am also interested in protecting sensitive JS code executed in the browser, and as we'll see in this post, obfuscation is often considered a good candidate for this kind of problem. Thus, even though this article is about obfuscation in general, some parts may be more related to JavaScript obfuscation in particular.

What is obfuscation?

Obfuscation is the process of transforming program code into a form that's more difficult to understand or change. It must not be mistaken for minification, which simply decreases the size of a file. Although minification makes the file unreadable, it's extremely easy to reformat it, and then to understand or modify the code with ease.

More formally, obfuscation can be defined as the process of transforming a program P into a program P' such that P and P' have the same observable behavior. In particular, if the original program P doesn't terminate or terminates with an error, then its obfuscated version P' may or may not terminate. However, if the original program P terminates, its obfuscated version P' must return the same value as P.

The notion of observable behavior is important. Strictly speaking, P and P' won't have exactly the same behavior since, as we will see later, some obfuscation techniques change the execution flow of the program, for example by reordering variables or functions. Observable behavior means that P and P' may behave differently internally, but these differences should not be perceived by the end user.
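To make the definition more concrete, here is a minimal sketch of two programs that differ internally but return the same values. The names `p`, `pPrime` and `sameObservableBehavior` are made up for this illustration, and comparing outputs on a few sample inputs only gives evidence of equivalence, not a proof.

```javascript
// Original program P (hypothetical example).
function p(x) {
    return x * 2;
}

// Transformed program P': different internal steps, same returned value.
function pPrime(x) {
    var t = x + 7;      // extra intermediate state, invisible to the caller
    return t + x - 7;
}

// Compare the observable behavior (here, the return value) on sample inputs.
function sameObservableBehavior(f, g, inputs) {
    return inputs.every(function (input) {
        return f(input) === g(input);
    });
}

var agrees = sameObservableBehavior(p, pPrime, [-3, 0, 1, 42, 1000]);
```

Here `agrees` ends up being `true`, even though `pPrime` goes through an intermediate value that `p` never computes.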

Why use obfuscation?

Obfuscation is used to protect code. Usually this is because you are distributing your code (with JavaScript, for example, you send it to every browser that requests a web page) but you don't necessarily want to expose the business logic it contains. By obfuscating your program, you make it harder for someone to understand its behavior. Moreover, if someone simply steals the code to use it on their own website, obfuscation makes it more difficult for them to maintain it and add new features.

Although encryption may seem more effective than obfuscation, it is of no use if the code is decrypted on the client's machine, since there will still be a moment when the client can see the unencrypted version of the code. Another way to protect the code is to move the critical logic to the server side. By providing an API that makes remote calls to the server containing the critical logic, the client no longer has access to the code. This solution adds overhead both for the client, which becomes dependent on network conditions, and for the company that provides the API, which needs to provide reliable servers, whereas with obfuscation the code is executed on the client device.

Evaluate obfuscation

This part summarizes how to characterize the efficiency of obfuscation, as well as its impact on the program that has been obfuscated. The evaluation process is split into three components that respectively answer the following questions:

* How much obscurity has been added to the program? (human)
* How difficult is it to automate the deobfuscation process? (machine)
* How much computational overhead is added by the obfuscation? (cost)

Potency

Potency measures how much more difficult the obfuscation process makes it for a human to understand the obfuscated program. In order to measure whether a new obfuscated program P' is less readable than the original program P, we use metrics from software engineering. These metrics reflect the quality, readability and maintainability of a program. Traditionally, the goal is to optimize these metrics so that a program becomes easier to maintain and read. With obfuscation, the goal is the opposite: to worsen these metrics so that the program becomes more obscure. For example, where one would usually try to minimize the program's size or variable dependencies, with obfuscation the goal is to maximize them.
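As a rough illustration, potency can be sketched as the relative increase of some complexity measure between P and P'. The metric used below (number of non-whitespace characters) is a deliberately crude assumption chosen for this sketch; real evaluations rely on richer software metrics such as nesting depth or data flow complexity. The function names are made up.

```javascript
// Crude complexity metric (assumption for this sketch):
// the number of non-whitespace characters in the source.
function complexity(source) {
    return source.replace(/\s+/g, "").length;
}

// Relative change in complexity between the original and the obfuscated
// source. A positive value means the obfuscated program scores as more complex.
function potency(originalSource, obfuscatedSource) {
    return complexity(obfuscatedSource) / complexity(originalSource) - 1;
}

var originalSource = "function sum(a, b) { return a + b; }";
var obfuscatedSource = "function sum(a, b) { var z = 42; return a + z + b - z; }";
var score = potency(originalSource, obfuscatedSource);
```

With this metric, `score` is positive because the transformed source is longer, which says nothing yet about how hard it is to undo.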

Resilience

Resilience measures how well an obfuscated program resists a deobfuscator. In contrast to potency, which measures the complexity added for a human, resilience measures the complexity added for a machine. It can be seen as the time required for a deobfuscator to reduce the potency of the obfuscated program, i.e. the time needed to make it more readable by a human.

Cost

Finally, the cost measures the overhead added by the obfuscation to the application. The overhead may be both in terms of file size and in terms of execution time. With JavaScript, where files must be sent over the internet to browsers when they request a page, it might be a problem if the obfuscated program is too big.
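Both aspects of cost can be estimated with very little code. The sketch below is only indicative: the helper names are made up, the size is measured on the raw source (before any compression the server might apply), and wall-clock timings are noisy.

```javascript
// Size overhead: ratio between the obfuscated and original sizes in bytes.
function sizeOverhead(originalSource, obfuscatedSource) {
    var encoder = new TextEncoder();
    return encoder.encode(obfuscatedSource).length /
           encoder.encode(originalSource).length;
}

// Elapsed time, in milliseconds, for `runs` calls of a two-argument function.
// Comparing the result for P and P' gives a rough idea of the time overhead.
function elapsedMs(fn, runs) {
    var start = Date.now();
    for (var i = 0; i < runs; i++) {
        fn(i, i + 1);
    }
    return Date.now() - start;
}

var ratio = sizeOverhead("return a+b;", "var z=42;return a+z+b-z;");
```

A ratio above 1 means the obfuscated file is larger than the original.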

Thus, the global quality of an obfuscation can be seen as a function of the potency (confusing a human), resilience (confusing a deobfuscator), and the overhead it introduces (execution time and size of the files).

Main transformations

In order to obfuscate a program, an obfuscator applies different kinds of transformations on the original program. We present these transformations in three categories depending on their target.

Layout

Layout transformations change the visual representation of the program. They are one way: once the transformation has been made, the previous state can't be recovered. In this category we find simple transformations such as renaming variables and functions, or removing comments. This kind of transformation removes the semantics carried by variable names and the hints present in the comments. We also find operations such as changing the code formatting. This generally consists in removing spaces and line breaks, which is one of the reasons why minification is often mistaken for obfuscation.

We give an example on a simple program:

// Compute the sum of 2 numbers
function sum(number1, number2) {
    return number1 + number2
}

The original program could be transformed in this new program below:

function sUSNO0(qsqzu_Pmj, azsd_hFZh){return qsqzu_Pmj+azsd_hFZh;}
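The renaming step can be sketched in a few lines. This is a naive illustration, not how production obfuscators work: it operates on raw text rather than on a syntax tree, so it would also rewrite matching words inside strings or comments. The function name and the `_0x` naming scheme are assumptions made for this example.

```javascript
// Naive identifier scrambler (illustration only): it works on raw text, so it
// would also rewrite matching words inside strings or comments.
function scrambleIdentifiers(source, identifiers) {
    var mapping = {};
    identifiers.forEach(function (name, index) {
        mapping[name] = "_0x" + index.toString(16);
    });
    // Replace every identifier-like token that has an entry in the mapping.
    return source.replace(/[A-Za-z_$][A-Za-z0-9_$]*/g, function (token) {
        return Object.prototype.hasOwnProperty.call(mapping, token)
            ? mapping[token]
            : token;
    });
}

var original = "function sum(number1, number2) { return number1 + number2; }";
var scrambled = scrambleIdentifiers(original, ["sum", "number1", "number2"]);
// scrambled: "function _0x0(_0x1, _0x2) { return _0x1 + _0x2; }"
```

The mapping is thrown away after the transformation, which is why the original names cannot be recovered from the output.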

Control transformations

Control transformations aim at obscuring the control flow of the application. These transformations are separated into three subcategories presented hereafter.

Aggregation transformations

Aggregation transformations consist in separating computations that logically belong together, or merging code that does not. This is based on the idea that code a programmer has grouped in a function or a class is probably related. Separating it into different functions or classes makes it more difficult to understand. Conversely, merging unrelated blocks of code makes it look like they have a semantic link.
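As a small sketch of the "separating" direction, the function below is outlined into helpers whose names carry no meaning. The price computation and all names are invented for this illustration.

```javascript
// Original: the related steps live together in one function.
function totalPrice(unitPrice, quantity, taxRate) {
    var subtotal = unitPrice * quantity;
    var tax = subtotal * taxRate;
    return subtotal + tax;
}

// Outlined version: the same steps are scattered over helpers with
// meaningless names, so the link between them is harder to see.
function f1(a, b) { return a * b; }
function f2(a, b) { return a + a * b; }

function totalPriceObfuscated(unitPrice, quantity, taxRate) {
    return f2(f1(unitPrice, quantity), taxRate);
}
```

Both versions perform the same arithmetic in the same order, so they return identical values.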

Ordering transformations

Ordering transformations randomize the order in which instructions are executed. They rely on the idea that spatial locality in the code, i.e. related functions or statements being close to each other, plays an important role in making the code understandable.
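A minimal sketch of statement reordering is shown below. Only statements without data dependencies between them can be swapped safely; the function and its contents are invented for this example.

```javascript
// Original: each value is computed right before it is used.
function describe(width, height) {
    var area = width * height;
    var areaLabel = "area=" + area;
    var perimeter = 2 * (width + height);
    var perimeterLabel = "perimeter=" + perimeter;
    return areaLabel + ";" + perimeterLabel;
}

// Reordered: statements are shuffled while respecting data dependencies,
// so related lines are no longer next to each other.
function describeReordered(width, height) {
    var perimeter = 2 * (width + height);
    var area = width * height;
    var perimeterLabel = "perimeter=" + perimeter;
    var areaLabel = "area=" + area;
    return areaLabel + ";" + perimeterLabel;
}
```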

Computation transformations

Computation transformations insert dead or redundant code, or make algorithmic changes. Their goal is to hide the real control flow by polluting it with irrelevant statements. In contrast to transformations that target the visual representation of the program, this set of transformations may have an impact on the execution time of the program (cost metric).

The example below shows how we could transform the previous sum function by adding dead code and altering the control flow, without changing its final result.

function sum(number1, number2) {
    var a = 42;
    var z ;
    var res;
    if(number1 < 753) {
        z = 890;
        res = number1 + z;
    } else {
        z = 56 + a;
        res = number1 + z;
    }
    return res+number2 - z;
}
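Dead code is often guarded by an opaque predicate, i.e. a condition whose outcome is known when the code is obfuscated but is not obvious to a reader. The sketch below uses a classic arithmetic fact (the product of two consecutive integers is even); the function name `alwaysTrue` and the dead branch are invented for this illustration, and such a simple predicate would not resist a serious deobfuscator.

```javascript
// Opaque predicate: x * (x + 1) is the product of two consecutive integers,
// so it is always even and this function always returns true.
function alwaysTrue(x) {
    return (x * (x + 1)) % 2 === 0;
}

function sum(number1, number2) {
    // `number1 | 0` converts the argument to a 32-bit integer for the test.
    if (alwaysTrue(number1 | 0)) {
        return number1 + number2;      // real computation
    }
    return number1 * number2 - 17;     // dead code, never executed
}
```

A reader has to work out that the second `return` can never be reached before being able to simplify the function.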

Data

Another target of transformations is data structures. These transformations are classified into three categories:

Storage and encoding transformations

This strategy relies on using a non "natural" way to encode or store data. For example, we can replace a boolean variable by two variables and a mapping function used to reconstruct the original value. If a variable v = false, we can represent it with the tuple (false, false, AND), (false, true, AND), or (true, false, AND), where AND is the && operator. Using this tuple of 3 elements, we can recompute the original value. Besides boolean variables, this technique can be generalized to other types of variables.

The example below illustrates how we can split a boolean variable into a tuple of 3 elements:

function evalBool(v1, v2) {
    return v1 && v2;
}

// Instead of storing directly the value in a boolean, we split it in 2 values and a function
if(evalBool(true, false)) {
    // do things
}
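The same idea can be sketched for integers: the value is stored as two parts and a mapping function rebuilds it when needed. The splitting constant `K` and the function names are arbitrary choices made for this illustration, and the sketch assumes non-negative integers.

```javascript
var K = 1000; // arbitrary splitting constant (assumption for this sketch)

// Split a non-negative integer into two parts that are stored separately.
function splitInt(value) {
    return { hi: Math.floor(value / K), lo: value % K };
}

// Mapping function that rebuilds the original value from its two parts.
function joinInt(parts) {
    return parts.hi * K + parts.lo;
}

var secret = splitInt(123456);   // stored as { hi: 123, lo: 456 }
var restored = joinInt(secret);  // 123456
```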

Another technique often used on strings is to avoid using the raw string directly, and instead convert the string into a program that produces it.
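A minimal sketch of this idea: the string is replaced by the source of an expression that rebuilds it at run time from shifted character codes. The function name and the shifting scheme are assumptions made for this example, and `shift` is assumed to be a positive integer.

```javascript
// Turn a string into the source of an expression that rebuilds it at run time.
// Assumes `shift` is a positive integer.
function encodeString(text, shift) {
    var codes = [];
    for (var i = 0; i < text.length; i++) {
        codes.push(text.charCodeAt(i) + shift);
    }
    return "[" + codes.join(",") + "]" +
        ".map(function(c){return String.fromCharCode(c-" + shift + ");})" +
        ".join('')";
}

var generated = encodeString("secret", 3);
// `generated` is JavaScript source; evaluating it yields the original string.
var decoded = new Function("return " + generated + ";")();
```

The literal text no longer appears in `generated`, so a plain text search for it in the obfuscated file finds nothing.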

Aggregation transformations

Aggregation transformations aim at aggregating data structures to hide their original representation. For example, they may operate on arrays by restructuring them. An array can be split into multiple sub-arrays, and multiple arrays can be merged into one. A one-dimensional array can be folded into a higher-dimensional array, or the opposite, called flattening, which decreases the dimension of an array. For example, representing a 2D grid with a 1D array makes it more difficult to understand the purpose of the variable.

Besides arrays, these transformations may also focus on inheritance relationships. They may increase the level of inheritance or, conversely, decrease it by merging subclasses.

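The example below shows how to flatten a 2D array into a 1D array:

// Map the cell (i, j) of a grid with n columns to its index in the 1D array
function getNewIndex(i, j, n) {
    return i * n + j;
}

// grid2D is assumed to be an existing array of m rows, each with n columns
var m = grid2D.length;
var n = grid2D[0].length;
var grid1D = new Array(m * n);

for (var i = 0; i < m; i++) {
    for (var j = 0; j < n; j++) {
        grid1D[getNewIndex(i, j, n)] = grid2D[i][j];
    }
}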

Ordering transformations

Ordering transformations randomize the order of declarations in the source code, in particular the order of methods and instance variables of a class, as well as the parameters of functions. It is also possible to randomize the order in which elements are stored in an array, by providing a function that, given an index i, maps it back to its original position in the array.
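Array reordering can be sketched with a simple index permutation. The permutation below (multiplying the index by a constant modulo the array length) is just one possible choice made for this example; `STEP` must be coprime with the array length so that every slot is used exactly once. All names are hypothetical.

```javascript
// Permutation used to shuffle storage positions: the element with logical
// index i is stored at (i * STEP) % length.
var STEP = 3;

function storageIndex(i, length) {
    return (i * STEP) % length;
}

// Build the reordered array from the logical one.
function reorder(values) {
    var stored = new Array(values.length);
    values.forEach(function (value, i) {
        stored[storageIndex(i, values.length)] = value;
    });
    return stored;
}

// Read the element with logical index i from the reordered array.
function readAt(stored, i) {
    return stored[storageIndex(i, stored.length)];
}

var stored = reorder(["a", "b", "c", "d", "e"]);
// stored is ["a", "c", "e", "b", "d"], yet readAt(stored, 1) still gives "b"
```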

The table below, taken from A taxonomy of obfuscation transformations, provides an overview of the different obfuscating transformations as well as their impact on potency, resilience and cost. A "+" character indicates that the value of the metric depends on the context.

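Obfuscation (target, operation, transformation) and quality (potency, resilience, cost)

| Target | Operation | Transformation | Potency | Resilience | Cost |
|---|---|---|---|---|---|
| Layout | | Scramble identifiers | medium | one way | free |
| Layout | | Change formatting | low | one way | free |
| Layout | | Remove comments | high | one way | free |
| Control | Computations | Insert dead code | (a) | (a) | (a) |
| Control | Computations | Extend loop condition | (a) | (a) | (a) |
| Control | Computations | Reducible to non reducible | (a) | (a) | (a) |
| Control | Computations | Add redundant operands | (a) | (a) | (a) |
| Control | Computations | Remove programming idioms | medium | strong | + |
| Control | Computations | Table interpretation | high | strong | costly |
| Control | Computations | Parallelize code | high | strong | costly |
| Control | Aggregation | Inline method | medium | one way | free |
| Control | Aggregation | Outline statements | medium | strong | free |
| Control | Aggregation | Interleave methods | (b) | (b) | (b) |
| Control | Aggregation | Clone methods | (b) | (b) | (b) |
| Control | Aggregation | Block loop | low | weak | free |
| Control | Aggregation | Unroll loop | low | weak | cheap |
| Control | Aggregation | Loop fission | low | weak | free |
| Control | Ordering | Reorder statements | low | one way | free |
| Control | Ordering | Reorder loops | low | one way | free |
| Control | Ordering | Reorder expressions | low | one way | free |
| Data | Storage and encoding | Change encoding | (c) | (c) | (c) |
| Data | Storage and encoding | Promote scalar to object | low | strong | free |
| Data | Storage and encoding | Change variable lifetime | low | strong | free |
| Data | Storage and encoding | Split variable | (d) | (d) | (d) |
| Data | Storage and encoding | Convert static to procedural data | (e) | (e) | (e) |
| Data | Aggregation | Merge scalar variables | low | weak | free |
| Data | Aggregation | Factor class | medium | + | free |
| Data | Aggregation | Insert bogus class | medium | + | free |
| Data | Aggregation | Refactor class | medium | + | free |
| Data | Aggregation | Split array | + | weak | free |
| Data | Aggregation | Merge arrays | + | weak | free |
| Data | Aggregation | Fold array | + | weak | cheap |
| Data | Aggregation | Flatten array | + | weak | free |
| Data | Ordering | Reorder methods and instance variables | low | one way | free |
| Data | Ordering | Reorder arrays | low | weak | free |

(a) Depends on the quality of the opaque predicate and the nesting depth at which the construct is inserted.
(b) Depends on the quality of the opaque predicate.
(c) Depends on the complexity of the encoding function.
(d) Depends on the number of variables into which the original variable is split.
(e) Depends on the complexity of the generated function.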

Some transformations listed in the table have not been presented in this post, but their names are rather explicit about their purpose. This post also didn't tackle the complexity of creating efficient opaque predicates capable of resisting deobfuscators. These points may be treated in future posts on obfuscation. Meanwhile, you can refer to this article for more details.


Not enough threads or processes : "thread create failed"

by Walter Rudametkin     linux   bug  

Posted 2014.04.10 — Rennes, France

I started running into a weird problem that I didn't immediately identify. Some programs just started failing: libreoffice, chrome, chromium, firefox, eclipse, ...

It was quite nondeterministic and depended on the system resources being fairly well used. I thought it was a RAM issue: not having enough memory would cause programs to fail. I have somewhat aggressive RAM settings, but I also have 16 GB of RAM on my computer. Well, it wasn't a memory issue; I was able to reproduce the issue with loads of memory still left over.

Here are some of the messages I was getting.

Libreoffice: (similar bug here)

osl::Thread::create failed

Java:

java.lang.OutOfMemoryError: unable to create new native thread.

Chrome and Chromium:

pthread_create error: Resource temporarily unavailable
Read More

Make chromium faster with video acceleration

by Walter Rudametkin     linux   chromium   feature  

Posted 2014.04.07 — Rennes, France

I tried this on my Dell e6530 and it seems to work pretty good.

http://www.borfast.com/blog/how-enable-webgl-google-chrome-linux-blacklisted-graphics-card

For video acceleration: go to chrome://flags, find "Override software rendering list" and set it to Enable.

This works for both Chrome and Chromium.

Read More

Add Chrome PDF Viewer to Chromium on Fedora 20 x86_64

by Walter Rudametkin     linux   chromium   feature  

Posted 2014.04.02 — Rennes, France

I really would like a PDF reader that saves the open pdf files in tabs, like a browser does. It would let me get back to reading whatever I was reading after a restart. Currently, I try to restart as little as possible, usually between 20 and 40 days in order to save all my open stuff. I would use Mendeley except the open tabs feature request has been open since 2009 and not implemented yet!!! http://feedback.mendeley.com/forums/4941-general/suggestions/263198-remember-open-tabs-and-position-within-pdfs.

My answer for PDFs is to use my browser to store my open files. It might be a little overkill but it does the job well enough. I mainly use Firefox + mozplugger + evince, but I also use the Chrome and Chromium browsers for different things including reading PDFs. Chrome has a simple PDF plugin that I like and it's pretty fast, but Chromium doesn't have it because of licensing issues.

Read More

Reduce touchpad sensitivity on a Dell e6430

by Walter Rudametkin     linux   feature  

Posted 2014.03.23 — Rennes, France

I find the touchpad on my laptop way too sensitive by default. To change it, I've found the following settings comfortable:

xinput set-prop "AlpsPS/2 ALPS DualPoint TouchPad" "Synaptics Finger" 18 18 18

I also have a wireless mouse that I sometimes use. It's way too sensitive. The following command makes it usable:

xinput set-prop "HP Wireless Optical Mobile Mouse" "Device Accel Adaptive Deceleration" 1.5
Read More

Acrobat reader cannot find libEGL.so.1 : Fedora 20

by Walter Rudametkin     linux   bug  

Posted 2014.03.19 — Rennes, France

I use acrobat reader sometimes, like when people send me an annotated pdf or if Evince doesn't print it properly. My main pdf reader is Evince, but it's a bit buggy. Also, I use okular to annotate pdfs, which it does a fine job at.

Anyhow, I was getting the following error after applying some updates:

acroread: error while loading shared libraries: libEGL.so.1: cannot open shared object file: No such file or directory

It took me a while to track down the problem, which has a simple solution: just find the missing symbolic link. I ran into this page that describes the issue, but nobody had provided the proper command: http://forums.fedoraforum.org/showthread.php?t=297151.

Read More

Fr alternative keyboard layout bug (fr-oss): right ctrl key not working

by Walter Rudametkin     linux   bug  

Posted 2014.03.12 — Rennes, France

I really like the 'alternative' french keyboard layout on my computer, a.k.a fr-oss. I like to use the deadkeys for accents in french and spanish, and I really like the alt-gr + ctrl# for doing arrows and stuff like that.

Earlier this year, on my Fedora laptop (F20) for work, the right ctrl key stopped working. It sucked because ctrl-arrows no longer jumped through words in text documents. In VLC, right ctrl+Up/Down no longer changed the volume either, making me use both hands just for that. I figured it was a bug in Fedora, so I found a neat command to fix the mapping while I waited for the problem to be fixed. However, some time later it happened to my wife's Ubuntu laptop, and looking into it I found that the problem comes from higher up the chain.

Someone decided to map right ctrl to something "new" instead of letting it be the same as the left ctrl key. Sure, if programs used the "new right control" it might be great and super useful. But they don't, so the key does nothing. Kind of beats the purpose, huh?

https://bugs.freedesktop.org/show_bug.cgi?id=15804#c44

To fix the problem, here's the command you can run.

xmodmap -e 'keycode 105 = Control_R' -e 'clear Control' -e 'add Control = Control_L Control_R'

However...

Read More

My father tortures me with beautiful, sunny pictures while I'm stuck in the rain

by Walter Rudametkin     life   fishing  

Posted 2014.03.07 — Rennes, France and Ensenada, Mexico

My Dad is a wonderful person and loves his children a lot, but as of late, he's decided to torture me with pictures from the beautiful Pacific coast of Mexico. Anybody who's been to California knows just how much sun there is. All. Year. Round.

Baja California is no different from California in that respect (well, it's even better IMHO).

That's all nice and all but I haven't seen the sun since fall started, back in October 2013 (wooooo, so long ago). In Rennes, France (the region of Brittany), we get the same shitty weather that the English get, give or take a less-cloudy day or two. So, how do you think I feel when Dad sends me these beautiful pictures? It makes me wanna cry. I feel nostalgic thinking of all of our fishing trips. I miss the food. I miss the sun. I miss the people. I miss a whole lot of things.

For example, here's a picture from last December.

Dad and a lingcod in the Bay of Ensenada

Read More

Thunderbird bug: leading spaces are removed when character set is automatically converted

by Walter Rudametkin     thunderbird   bug  

Posted 2014.03.05 — Rennes, France

I ran into a weird bug that took me about 50 test emails to understand. Thunderbird appears to automatically convert your plain-text emails from one character set to another if it detects an incompatible character.

However, the conversion isn't as smooth as it should be. In fact, in my case, all leading spaces in my neatly formatted plain-text format=flowed emails disappear.

Read More

Testing different markup options

by Walter Rudametkin     markdown   untidy  

Posted 2014.02.06 — Rennes, France

This post is a test for markdown and Jekyll. I basically have all kinds of examples of how to do stuff in markdown, but it's not tidy at all.

Read More

First post using jekyll

by Walter Rudametkin     first-post   jekyll   github-pages  

Posted 2014.02.04  

This is my first post after configuring my site to use GitHub Pages and, more specifically, after I started using Jekyll.

Read More

Mount your Box.Net account using WebDAV

by Walter Rudametkin     linux   webdav   cloud  

Posted 2011.09.17 — Grenoble, France

It's pretty easy to set up box.net with WebDAV and to have your files automatically saved to box.net. I got 50 GB of storage space on box.net when I bought a 32 GB HP Touchpad on firesale.

However, because it's not a paid or professional account, I can't really do anything with it.

UPDATE: I don't use this setup because box.net is just too slow and some people have been complaining that they don't implement webdav properly. I really wish they'd come up with a client for Linux like Dropbox has.

Originally posted on Ubuntu Forums: http://ubuntuforums.org/showthread.php?t=202761&page=4

Read More

NSLU2 Linux Gateway

by Walter Rudametkin     linux   nslu2   work-in-progress  

Posted 2010.07.29 — Rennes, France
Work in progress

It's easy to set up additional IP addresses on Debian Linux. This is particularly useful for the NSLU2, which doesn't have a display, so you need to connect to it remotely.

Having more than one virtual network interface allows us to have both a DHCP address and a static IP address, making the NSLU2 accessible pretty much always.

Read More
