How future languages will prevent Log4Shell

Java

Coding Dec 22, 2021

The Log4Shell exploit exposes a conceptual problem in conventional programming languages: third-party code is able to execute arbitrary side-effects. Some modern programming languages introduce concepts that avoid such vulnerabilities by design.

Log4Shell in a nutshell

The Log4Shell vulnerability boils down to a log statement doing more than just logging.

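import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class MyClass {
    private static final Logger LOG = LogManager.getLogger("MyClass");

    public static void main(String[] args) {
        // Logs the first command-line argument, which is untrusted input
        LOG.info(args[0]);
    }
}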

When you run this program with java MyClass abc, you expect it to log abc. For the vast majority of inputs it does exactly that. However, vulnerable versions of Log4j interpreted certain inputs, such as strings containing a ${jndi:...} lookup, in a way that caused untrusted Java code to be downloaded and executed. There are many more details to Log4Shell, which other posts explain excellently.
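To make "a log statement doing more than just logging" concrete, here is a minimal sketch of a toy logger that expands lookup tokens in the message before printing it. This is not Log4j's implementation; ToyLogger, expand and the harmless upper resolver are hypothetical names invented for this illustration. The point is only that the logged string is interpreted instead of being treated as plain data.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyLogger {
    // Matches "${prefix:key}" tokens inside a log message.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{(\\w+):([^}]*)\\}");

    // Hypothetical resolvers, keyed by prefix. Only a harmless one is registered
    // here; the dangerous analogue would be a resolver that contacts a remote server.
    private static final Map<String, Function<String, String>> RESOLVERS =
            Map.of("upper", key -> key.toUpperCase());

    // Expands every known lookup token; unknown tokens are left untouched.
    public static String expand(String message) {
        Matcher m = LOOKUP.matcher(message);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Function<String, String> resolver = RESOLVERS.get(m.group(1));
            String replacement = (resolver != null) ? resolver.apply(m.group(2)) : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // The "log statement": it interprets the message before printing it.
    public static void info(String message) {
        System.out.println(expand(message));
    }

    public static void main(String[] args) {
        info("abc");           // plain input is printed unchanged
        info("${upper:abc}");  // input containing a lookup token is evaluated first
    }
}
```

Whoever controls the message controls which resolver runs. With a resolver that loads remote resources, a log call becomes a code execution path.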

The conceptual issue

The underlying issue that led to this vulnerability starts with the log statement LOG.info() being able to do something other than logging. In Java, every method void f() {} is able to execute any side-effect. It can read and write files, it can print text to stdout, and it can download resources from the internet and execute them. The language neither restricts the side-effects of a method nor indicates which side-effects a method performs.
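The following sketch illustrates the point with two methods that have the same signature shape but very different behavior. HiddenEffects, logHonest, logSneaky and SIDE_FILE are hypothetical names chosen for this example, and the file written to the temp directory merely stands in for any unwanted side-effect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HiddenEffects {
    // Hypothetical location used to make the hidden effect observable.
    public static final Path SIDE_FILE =
            Path.of(System.getProperty("java.io.tmpdir"), "hidden-effect-demo.log");

    // Does what its name suggests: prints the message.
    public static void logHonest(String message) {
        System.out.println(message);
    }

    // Same signature shape, but it also writes to the file system.
    public static void logSneaky(String message) {
        System.out.println(message);
        try {
            Files.writeString(SIDE_FILE, message);
        } catch (IOException e) {
            // Wrapping in an unchecked exception hides even the "throws" clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        logHonest("abc");
        logSneaky("abc");
        System.out.println("side file exists: " + Files.exists(SIDE_FILE));
    }
}
```

A caller who only sees void logSneaky(String) has no way to tell from the type that a file is being written; only reading the implementation reveals it.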

The conceptual solution

Some newer programming languages address this conceptual issue. One of them is Koka. The fundamental idea is to encode the side-effects of each function in its type signature.

fun multiplyAndPrint(x: int, y: int): console int {
    val result = x * y
    println(result)
    return result
}

The signature of multiplyAndPrint states that the function returns an integer and also that it accesses the console. If the signature does not declare that the function accesses the console, the compiler rejects the program with a type error.

fun multiply(x: int, y: int): int {
    val result = x * y
    println(result) // <-- Error: println uses console, multiply declares no effect
    return result
}

The predefined function println declares that it uses console, whereas multiply declares no effects at all, hence the compiler rejects the program.

fun println(s: string): console () {
    ...
}

If a function declares only the console effect, this also means that it cannot access the network, cannot download code, and cannot execute it.

Conclusion

The presented language concept can prevent third-party code from executing unwanted side-effects. A logging library could still define a log function that downloads or executes code, but this becomes visible in the type signature, so consumers can decide whether granting these permissions is a worthwhile trade-off.
