Date: March 25, 2023 rank: 60 team: ST4RT

fishing - rev ( 166 pt, 32 solve )

In _main function, calls _do_global_ctors and functions at 0x140004100 are called in turn

_main 함수에서 _do_global_ctors 함수를 호출 한 뒤, 0x140004100 에 있는 함수들이 차례로 호출된다.

those function can’t be decompiled due to anti disassemble jumping to opcode+1 . so I wrote idapython script to fix that anti disassemble

해당 함수들을 디컴파일해보면,opcode+1 로 점프해 디스어셈블러가 망가지는 것을 확인할 수 있다. idapython 으로 저렇게 생긴 부분들 정상적으로 디컴할 수 있게하는 스크립트 작성

  
import re
import time
functions = [(0x140001AE6,0x140001B12),(0x140001B13,0x140001ba3),(0x140001BA4,0x140001cde)]
functions = [(0x1400020FA,0x14000222F)]
functions = [(0x1400017d0,0x140001ae5)]
pattern = r"jmp.+(14[\\da-fA-F]{7}\\+1)"
for f in functions:
    start,end=f
    print('start',hex(start))
    while start < end:
        create_insn(start)
        line = generate_disasm_line(start,0)
        print(line)
        if re.search(pattern,line):
            print(line)
            patch_byte(start,0x90)
            create_insn(start)
            start+=2
        if('db' in line):
            patch_byte(start,0x90)
            create_insn(start)
            start+=1
        else:
            start+=1
    print('end',hex(end))

print(time.time())
print('+'*30)

0x140001AE6 -> call AddVectoredExceptionHandler, handler → 0x1400017D0

0x140001B13 -> get gs register to detect debugger

0x140001BA4 -> get gs register, GetProcessHeap to detect debugger

checking Xref to _main, sub_1400020FA calls _main

after calling the _main function, it gets input and creates a thread with input as an argument and call SetThreadContext to set Dr0~3 register

sub_1400020FA 함수에서 _main 함수를 호출한 뒤, 문자열을 입력받고 해당 문자열을 인자로 스레드를 생성하는 루틴을 볼 수 있었다. 스레드를 생성하고, ThreadContext 설정

  
Context.Dr0 -> xor 0x21 func
Context.Dr1 -> sub 0x22 func
Context.Dr2 -> xor 0x11 func
Context.Dr3 -> add 0x12 func

Thread Handler

call xor 0x21 with input
call sub 0x22 with input
call xor 0x11 with key
call add 0x12 with key
custom RC4 encryption
compare

Since there is only exit after detecting debugger, I can get RC4 key, patching exit to ret.

디버거를 단순 탐지 후 exit 하는 것 밖에 없기 때문에, exit 패치하고 4 까지 가면, 사용되는 키 값을 알 수 있다.

key : m4g1KaRp_ON_7H3_Hook

https://ling.re/hardware-breakpoints/

But when I xor, sub, RC4 my input, it doesn’t match to values in debugger

그런데, thread handler 루틴 그대로 해도 input 을 암호화 한 결과가 디버깅할 때 값이랑 일치하지 않는다.

it’s due to AddVectoredExceptionHandler, when the functions in Dr0~3 are called, the handler is executed first, then the function is executed

이것은 위에서 AddVectoredExceptionHandler 으로 Debug Register 핸들러를 등록하고, Dr 레지스터를 설정해서 다른 루틴이 추가로 실행되기 때문이다.

  
__int64 __fastcall sub_1400017D0(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
...
  if ( ExceptionInfo->ExceptionRecord->ExceptionCode != 0x80000004 )
    return 0i64;
  if ( (ExceptionInfo->ContextRecord->Dr6 & 1) != 0 )
  {
    Rcx = ExceptionInfo->ContextRecord->Rcx;
    Rdx = ExceptionInfo->ContextRecord->Rdx;
    for ( i = 0; i < Rdx; ++i )
      *(i + Rcx) ^= i;
  }
  else if ( (ExceptionInfo->ContextRecord->Dr6 & 2) != 0 )
  {
    v5 = ExceptionInfo->ContextRecord->Rcx;
    v4 = ExceptionInfo->ContextRecord->Rdx;
    for ( j = 0; j < v4; ++j )
      *(j + v5) = (*(j + v5) >> 5) | (8 * *(j + v5));
  }
  else if ( (ExceptionInfo->ContextRecord->Dr6 & 4) != 0 )
  {
    v12 = 105;
    v7 = ExceptionInfo->ContextRecord->Rcx;
    v6 = ExceptionInfo->ContextRecord->Rdx;
    for ( k = 0; k < v6; ++k )
    {
      *(k + v7) ^= v12;
      v12 *= v12;
    }
  }
  else if ( (ExceptionInfo->ContextRecord->Dr6 & 8) != 0 )
  {
    v9 = ExceptionInfo->ContextRecord->Rcx;
    v8 = ExceptionInfo->ContextRecord->Rdx;
    for ( m = 0; m < v8; ++m )
    {
      *(m + v9) = 2 * *(m + v9) + 3 * m;
      *(m + v9) ^= m + 5;
    }
  }
  ExceptionInfo->ContextRecord->Dr6 = 0i64;
  ExceptionInfo->ContextRecord->EFlags |= 0x10000u;
  return 0xFFFFFFFFi64;
}

Which BP is affected can be known according to each bit of Dr6.

  
Context.Dr0 -> xor 0x21 -> xor idx
Context.Dr1 -> sub 0x22 -> ror 5
Context.Dr2 -> xor 0x11 -> v12=105, xor V with v12, v12 ** 2
Context.Dr3 -> add 0x12 -> ( V * 2 + 3 * idx ) ^ ( idx + 5 )

When I debug, some handlers didn’t work. So the input encryption routine is finally :

디버깅 해보니까, input 암호화 할 때 작동하지 않는 핸들러도 있었다. 그래서 input 암호화 루틴은 최종적으로 다음과 같다.

xor with 0x21
sub 0x22 handler
sub with 0x22
custom RC4 with key “m4g1KaRp_ON_7H3_Hook”

solve.py

  
key=b"m4g1KaRp_ON_7H3_Hook"
flag=list(map(lambda x:int(x,16),"D0 BE 9F 5A BD F0 34 B5 D0 6F FB E2 99 BA AE D7 36 D5 2D C2 22 45 B0 03 9D 63 66 53 C7 28 CC 2A 2B 14 BB 09 9B E3 60 46 3A 00 00 00 00 00 00 00".split()))

## RC4
def PSK(t,key):
    for i in range(256):
        t[i] = i
    v6=0
    for j in range(256):
        v6 = (t[j]+v6+key[j%len(key)])&0xff
        v4 = t[j]
        t[j]=t[v6]
        t[v6]=v4

def enc(v7,data,out):
    v12 = 0
    v11 = 0
    for i in range(0x28):
        v12 = v12+1
        v11 = (v7[v12]+v11)&0xff
        v9 = v7[v12]
        v7[v12]=v7[v11]
        v7[v11]=v9
        v8 = v7[(v7[v12]+v7[v11])&0xff]
        out[i] = ((v11-24)^v8^data[i])&0xff

def xor_21(buf):
    for i in range(len(buf)):
        buf[i] ^= i
def sub_22(buf):
    for i in range(len(buf)):
        ## ror 5
        # buf[i] = ((buf[i]<<3)&0xff) | ((buf[i]>>5)&0x07)
        ## rol 5
        buf[i] = ((buf[i]<<5)&0xff) | ((buf[i]>>3)&0x1f)
def xor_11(buf):
    v12 = 105
    for i in range(len(buf)):
        buf[i] ^= v12
        v12 = v12**2 & 0xff
def add_12(buf):
    for i in range(len(buf)):
        buf[i] = 2 * buf[i] + 3 * i
        buf[i] &=0xff
        buf[i] ^= i+5

table=[0]*256
PSK(table,key)
out=[0]*0x28
enc(table,flag,out)
out = list(map(lambda x:x+0x22&0xff,out))
sub_22(out)
out = list(map(lambda x:x^0x21,out))

print(bytes(out))
## LINECTF{e255cda25f1a8a634b31458d2ec405b6}

(CTF) LINE CTF 2023 fishing writeup

fishing - rev ( 166 pt, 32 solve )

Thread Handler

solve.py

Further Reading

(CTF) SEETF CTF 2023 NOW, Murky SEEPass writeup

(CTF) Google CTF 2023 writeup (oldschool, ubf)

(CTF) corCTF 2023 writeup (diophantus,babyWallet,utterly-tangled)