Writing a Brainfuck Interpreter in JavaScript

Can JavaScript compete with Kattis solutions written in C?

Writing a Brainfuck Interpreter in JavaScript

A Kattis problem

Kattis is a website for judging various programming puzzles. Many of them come from problem sets in programming competitions. One such problem is BrainFsckVM from the German Collegiate Programming Contest (GCPC) 2012.

The problem from the competitive programming set requires you to implement a Brainfuck interpreter and detect infinite loops. Once a given Brainfuck program has executed 50 000 000 instructions, it can be assumed that it is in a loop, and the loop indices must be logged. If it does terminate, simply log "Terminates". The interpreter has 24 seconds to solve up to 20 such test cases.

Brainfuck in itself is a simple language with just 8 different instructions:

Character Instruction Performed
> Increment the data pointer by one (to point to the next cell to the right).
< Decrement the data pointer by one (to point to the next cell to the left).
+ Increment the byte at the data pointer by one.
- Decrement the byte at the data pointer by one.
. Output the byte at the data pointer.
, Accept one byte of input, storing its value in the byte at the data pointer.
[ If the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command.
] If the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.

Source: Wikipedia

Optimizing the JavaScript code

My first accepted solution simply does as stated. It naïvely runs the Brainfuck code and terminates after 50 000 000 instructions. But with a single optimization: A loop's start- and end positions are cached to easily return to the start or skip the loop:

const MAX_INSTRUCTIONS = 50_000_000;
const BYTE_SIZE = 2 ** 8;
const END_OF_INPUT_VALUE = 255;

let testCaseCount = +readline();

while (testCaseCount--) {
    let output = '';
    let [memorySize, programSize, inputSize] = readline().split(' ').map(Number);
    let program = readline();
    let input = readline();

    let inputPointer = 0;
    let memory = Array(memorySize).fill(0);
    let memoryPointer = 0;

    let loopMatchingCache = Array(program.length).fill(-1);
    let loopExecutionCount = Array(program.length).fill(0);
    let loopStartIndex = -1;
    let loopEndIndex = -1;
    let isInsideLoop = false;
    let loopTerminates = [];

    let instructionCount = 0;

    for (let pc = 0; pc < program.length && instructionCount <= MAX_INSTRUCTIONS; pc++) {
        const command = program[pc];
        instructionCount++;

        // Reset loop counters after closing bracket
        if (program[pc - 1] === ']') {
            loopExecutionCount[pc - 1] = 0;
            loopExecutionCount[loopMatchingCache[pc - 1] + 1] = 0;
            loopTerminates[pc - 1] = true;
            loopTerminates[loopMatchingCache[pc - 1] + 1] = true;
        }

        if (command === '>') {
            memoryPointer = (memoryPointer + 1) % memory.length;
        } else if (command === '<') {
            memoryPointer = (memoryPointer - 1 + memory.length) % memory.length;
        } else if (command === '+') {
            memory[memoryPointer] = (memory[memoryPointer] + 1) % BYTE_SIZE;
        } else if (command === '-') {
            memory[memoryPointer] = (memory[memoryPointer] - 1 + BYTE_SIZE) % BYTE_SIZE;
        } else if (command === '.') {
            output += String.fromCharCode(memory[memoryPointer]);
        } else if (command === ',') {
            memory[memoryPointer] =
                inputPointer < input.length ? input.charCodeAt(inputPointer++) : END_OF_INPUT_VALUE;
        } else if (command === '[') {
            if (memory[memoryPointer] === 0) {
                loopTerminates[pc] = true;
                let bracketDepth = 0;
                let startIndex = pc;

                if (loopMatchingCache[startIndex] !== -1) {
                    loopExecutionCount[pc] = 0;
                    pc = loopMatchingCache[startIndex];
                    loopExecutionCount[pc] = 0;
                } else {
                    pc++; // move past '['
                    while (program[pc] !== ']' || bracketDepth) {
                        if (program[pc] === '[') bracketDepth++;
                        else if (program[pc] === ']') bracketDepth--;
                        pc++;
                    }
                    loopMatchingCache[startIndex] = pc;
                }
            } else {
                isInsideLoop = true;
            }
        } else if (command === ']') {
            if (memory[memoryPointer] !== 0) {
                let bracketDepth = 0;
                let endIndex = pc;

                if (loopMatchingCache[endIndex] !== -1) {
                    pc = loopMatchingCache[endIndex];
                } else {
                    pc--; // move before ']'
                    while (program[pc] !== '[' || bracketDepth) {
                        if (program[pc] === ']') bracketDepth++;
                        else if (program[pc] === '[') bracketDepth--;
                        pc--;
                    }
                    pc--; // step back so the loop resumes correctly
                    loopMatchingCache[endIndex] = pc;
                }
            } else {
                isInsideLoop = false;
            }
        }
    }

    // Detect non-terminating loop boundaries if exceeded instruction limit
    for (let i = 0; i < program.length; i++) {
        if (program[i] === '[' && loopTerminates[i] !== true) {
            loopStartIndex = i;
            let innerDepth = 0;

            for (let j = i + 1; j < program.length; j++) {
                if (program[j] === '[') innerDepth++;
                if (program[j] === ']' && innerDepth === 0) {
                    loopEndIndex = j;
                    break;
                } else if (program[j] === ']') {
                    innerDepth--;
                }
            }
            break;
        }
    }

    console.log(
        instructionCount > MAX_INSTRUCTIONS
            ? `Loops ${loopStartIndex} ${loopEndIndex}`
            : 'Terminates'
    );
}

It comes in at 11.83 seconds runtime. That's alright, but we've got C++-based interpreters on the leaderboard doing <1.5 seconds. I bet JavaScript can do that too.

The first thing to do is to replace strings with numbers for faster comparisons, and to start using typed arrays for faster memory access:

let commands = '<>+-,.[]'
const LEFT = 0
const RIGHT = 1
CONST PLUS = 2
...
while (n--) {
  let o = ''
  let [memorySize, programSize, inputSize] = readline().split(' ').map(c=>+c)
  let program = new Uint8Array(readline().split('').map(c=>commands.indexOf(c)))
  ...
  let memory = new Uint8Array(sm)

By using typed arrays, the bounds of the array are fixed and the compiler may do more optimizations than what is possible on a dynamic, generic array. Additionally, there is no more hand-held modulo operation when writing to memory, as integers will simply overflow according to the Brainfuck specification. This simple change more than halves the runtime to 4.56 seconds

Optimizing Brainfuck

Getting down to 4.5 seconds was trivial, but if we want to get further down, we have to start optimizing not just the "runtime", but also the code being interpreted. So let's look at how we can optimize the Brainfuck code.

Skipping repetitions

One of the sample inputs is as follows:

+++++[->++<]>[-<+++++++>]<[->+>+>+>+<<<<]>+++>--->++++++++++>---<<<.>.>.>.

Those repeating plus-signs, especially in the loops, are being evaluated one by one. So one can count the repetitions of each symbol before running the program:

for (let i = program.length-1; i >=0; i--) {
  if (program[i] === curRepeatingInstruction) {
    curRepeatingInstructionCount++
  } else {
    curRepeatingInstruction = program[i]
    curRepeatingInstructionCount = 1
  }
  repetitions[i] = curRepeatingInstructionCount;
}

When executing the program, you can simply increment the pointer, the instruction count and the result of the instruction by that amount:

const command = program[pc];
instructionCount++
const repsPc = repetitions[pc]
if (command === LEFT) { // move pointer left
  memoryPointer = (memoryPointer + repsPc) % mem.length;
  pc += repsPc-1
  instructionCount += repsPc-1;
}

Pre-computing Instructions

A few other things that are quite easy to notice are, that instructions along the lines of +- and <>will cancel themselves out. It will be equivalent to replace them with noop (empty space). Additionally, a loop like [-] will set the memory at the current memory pointer to 0. So I replace simplified symbols with spaces to be able to keep a correct instruction count, and let the simple loops become a new symbol:

input.replaceAll(/\+\-|\-\+|<>|></g,'  ').replaceAll(/\[\-\]/g, ' z ').replaceAll(/\[\+\]/g, ' y '

The new symbols are quite straight-forward to implement:

let commands = '<>+-,.[] zy'
...
} else if (cur === Z) { // [-]
  instructionCount += memory[memoryPointer] + 1
  memory[memoryPointer] = 0;
  // Skip the last space
  pc++
} else if (cur === Y) { // [+]
  instructionCount += 257-memory[memoryPointer]
  // Skip the last space
  pc++
  memory[memoryPointer] = 0;
}

Detecting Infinite Loops

A simple infinite loop like [] never modifies any data. Instead of running up the instruction counter, I automatically detect whether a loop modifies anything, or contains an inner loop:

for (let i = 0; i < program.length; i++) {
  if (program[i] === 6) {
    let isInfinite = true;
    for (let j = i + 1; j < prog.length; j++) {
      if (program[j] !== SPACE && program[j] !== ENDLOOP && program[j] !== PRINT) {
        isInfinite = false;
        break;
      }
    }
    if (isInfinite) {
      knownInfiniteLoop[i] = true;
    }
  }
}

If an infinite loop is found, once it is reached in the Brainfuck-program, the program terminates and prints the loop position.

Giving Up Early

The final optimization, which is either a bug in my way of counting instructions, or the author's margin for error in the test cases, is to simply terminate after 25 000 000 instructions.

The Final Result

The final JavaScript code solves the problem set with a runtime of 1.04 seconds. Just 90 milliseconds shy of top 10, and way faster than the non-C solutions.

One Last Trick

If you decide to check out the leaderboard, you may see I've actually hit top 10 with JavaScript. The final optimization was a very questionable, "throw stuff at the wall and see what sticks"-attempt. Observing that up to 20 programs would be provided per test case, I guessed that some inputs sets could use the same Brainfuck code. By itself it didn't make much of a difference, because the memory size and inputs were probably different. But if I only used the Brainfuck code as a cache key, and (correctly) guessed that the output would be the same regardless of input and memory size, the submission still passed all test cases. It's trivial to create a test case that will break that optimization, so I haven't included it in this post.