skip to content
sumr.dk

Writing a bad programming langauge in rust

/ 6 min read

Writing a bad programming language in rust

”Backstory”

I was doing a “very hard” reversing challenge for DDC and figured i’d try writing my own “programming language”.

Sorry for all the quotation marks, nothing of this is supposed to be super serious.

I recently tried rust and thought it was cool and wanted to write some code with it. Perfect storm.

What was the challenge

You get zip with the binary and this python

Python code:

import subprocess
import time
import random
import math
PATH = "./marlang"
input_string = input("code> ") + "\n"
for i in range(1000):
a = random.randint(1,100000)
b = random.randint(1,100000)
process = subprocess.Popen(PATH, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
process.stdin.write(input_string)
time.sleep(0.001)
process.stdin.write(f"{a}\n")
time.sleep(0.001)
process.stdin.write(f"{b}\n")
stdout, stderr = process.communicate()
if math.gcd(a,b) != int(stdout): exit()
print("DDC{some flag}")

How to write a dumb language (for eel this time)

Steps:

  • Find a way to parse input fx instruction<space>char<space>char so your computer knows what the text means in terms of structs // objects
  • Write what each instruction does with the data on some allocated memory
  • Step through an array of instructions and data
  • Know you have dumb language

I will now dump the source so you can see for yourself :)

use std::io;
#[derive(Debug, PartialEq)]
enum Value {
I32(i32),
Address(Box<Value>),
}
#[derive(Debug)]
enum OpCode {
AddP,
SubP,
Add(Value),
Sub(Value),
Mult(Value),
Div(Value, Value),
Mov(Value),
JNE(Value),
JMP(Value),
Cmp(Value, Value),
CLR(Value),
Loop(Value, Value, Value),
Read,
Print,
Exit,
}
fn value_parser(raw: String) -> Value {
match raw {
_ if raw.starts_with("&") => {
Value::Address(Box::new(Value::I32(raw[1..].parse::<i32>().unwrap())))
}
_ if raw.parse::<i32>().is_ok() => {
Value::I32(raw.parse::<i32>().unwrap())
}
_ => Value::I32(0)
}
}
fn get_value(value: &Value, mem: [i32; 50]) -> i32 {
if let Value::I32(val) = *value {
return val
}
if let Value::Address(boxed_value) = value {
if let Value::I32(val) = **boxed_value {
return mem[val as usize]
}
}
0
}
fn lexer(raw: String) {
let removed_nl = raw.strip_suffix("\n");
let symbols: Vec<&str> = removed_nl.unwrap().split(" ").collect();
let mut operations = Vec::<OpCode>::new();
for symbol in symbols {
let op = match symbol {
"->" => Some(OpCode::AddP),
"<-" => Some(OpCode::SubP),
"r" => Some(OpCode::Read),
"p" => Some(OpCode::Print),
"#" => Some(OpCode::Exit),
_ if symbol.starts_with("+") => {
Some(OpCode::Add(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with("-") => {
Some(OpCode::Sub(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with("*") => {
Some(OpCode::Mult(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with("/") => {
let split: Vec<&str> = symbol.split(",").collect();
let a = value_parser(split[0][1..].to_string());
let b = value_parser(split[1].to_string());
Some(OpCode::Div(a,b))
}
_ if symbol.starts_with("=") => {
let split: Vec<&str> = symbol.split(",").collect();
let a = value_parser(split[0][1..].to_string());
let b = value_parser(split[1].to_string());
Some(OpCode::Cmp(a,b))
}
_ if symbol.starts_with("!") => {
Some(OpCode::JNE(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with("j") => {
Some(OpCode::JMP(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with(".") => {
Some(OpCode::Mov(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with("^") => {
Some(OpCode::CLR(value_parser(symbol[1..].to_string())))
}
_ if symbol.starts_with(":") => {
let split: Vec<&str> = symbol.split(",").collect();
if split.len() != 3 { continue }
let instruction_count = value_parser(split[0][1..].to_string());
let loops = value_parser(split[1].to_string());
let address = value_parser(split[2].to_string());
Some(OpCode::Loop(instruction_count, loops, address))
}
_ => None
};
match op {
Some(op) => operations.push(op),
None => ()
}
}
run(operations);
}
fn run(opcodes: Vec<OpCode>) {
let mut pointer = 0;
let mut mem: [i32; 50] = [0; 50];
let mut index = 0;
let mut cmp_flag = false;
while index< opcodes.len() {
let opcode: &OpCode = &opcodes[index];
match opcode {
OpCode::Add(value) => {
let val = get_value(value, mem);
mem[pointer] += val;
}
OpCode::Sub(value) => {
let val = get_value(value, mem);
mem[pointer] -= val;
}
OpCode::Mult(value) => {
let val = get_value(value, mem);
mem[pointer] *= val;
}
OpCode::Div(a, b) => {
let a = get_value(a, mem);
let b = get_value(b, mem);
let remainder = mem[pointer]%a;
mem[pointer] /= a;
mem[b as usize] = remainder;
}
OpCode::Cmp(a, b) => {
let a = get_value(a, mem);
let b = get_value(b, mem);
cmp_flag = a==b
}
OpCode::JMP(value) => {
let val = get_value(value, mem);
index = val as usize;
}
OpCode::JNE(value) => {
let val = get_value(value, mem);
if !cmp_flag {
index = val as usize;
continue;
}
}
OpCode::Mov(value) => {
let val = get_value(value, mem);
mem[pointer] = val;
}
OpCode::CLR(value) => {
let val = get_value(value, mem);
mem[val as usize] = 0;
}
OpCode::AddP => pointer+=1,
OpCode::SubP => pointer-=1,
OpCode::Read => {
let mut buffer = String::new();
let stdin = io::stdin();
let _ = stdin.read_line(&mut buffer);
buffer = buffer.strip_suffix("\n").unwrap().to_string();
let result =buffer.parse::<i32>().unwrap();
mem[pointer] = result
}
OpCode::Print => println!("{}", mem[pointer]),
OpCode::Exit => break,
OpCode::Loop(instr, num , address) => {
let instr_parsed = get_value(instr, mem);
let max = get_value(num, mem);
let address = get_value(address, mem);
if mem[address as usize] != max {
mem[address as usize]+=1;
index-=instr_parsed as usize;
continue;
}
}
}
index+=1;
}
}
fn main() -> io::Result<()> {
let mut buffer = String::new();
let stdin = io::stdin();
stdin.read_line(&mut buffer)?;
lexer(buffer);
Ok(())
}

TLDR - pattern matching OP

Solving it

Basically just write the GCD algorithm in the stupid language.

A simple python solution for GCD could be this

def hcfnaive(a, b):
if(b == 0):
return abs(a)
else:
return hcfnaive(b, a % b)

Not super fast, but very short and concise - neat.

Reversing rust sucks and takes lots of time. Since I made the source - this would be silly for me to act like i havent seen it before. I probably could write the whole thing up, but I dont think it would be worth it.

But retyping everything or stepping through it in GDB with some guesswork about what each instruction does would probably give you a pretty nice result.

How the langauge works

The program has a list of instructions, and a “virtual memory”.

A simple program could look like this

r -> r <- p -> p

This code reads input, increments mempointer, reads input again, goes back to the 0th mem pointer, then prints, then goes forward again, then prints.

it just werks.

Here is a list of all the instructions and what they do.

  • r: reads an integer to current memory pointer
  • p: prints integer in current memory pointer
  • -> increments memory pointer
  • <- decrements memory pointer
  • # exits
  • +<num>adds number to current value
  • -<num>j subtracts number to current value
  • \*<num> multiplies number to current value
  • /<num>,<num> divides number with current value and stores remainder in address of second param
  • =<num> compares current value with number and stores result in the cmp_flag
  • j<num> jumps to that index of instructions so in the above example. ‘j0’ would jump to the first read
  • !<num> jump not equal. if the cmp flag is false, jump like above example
  • .<num> move. Moves the value of the number specified into the current value.
  • ^ clear. sets current value to 0
  • :<instruction>,<times_to_loop>,<store_increment>. Loop. example is easier here. :2,3,4 jumps 2 instructions back and increments address 4 one time. if address 4 has a value of 3 then dont do it anymore.

How to do GCD then?

r -> r =&1,0 !8 <- p # -> .&1 -> .&0 /&2,3 <- <- <- .&2 -> .&3 j2

Description

read and move mem pointer
check if &1 = 0
jne to gcd
move pointer to mem0
print
exit
label gcd
mov address of mem0 to curr addr
div address of mem2 to curr addr and put remainder in mem3
go back to mem0
mov new values to mem0 and mem1
jump to instruction 2 (0 indexing)

Now you can write marlang. Hope you will use this dumb langauge for another stupid esoteric CTF challenge wink wink.