When analyzing position-independent code (e.g. shellcode or malicious code snippets), you'll frequently see something like the following:
The `call` is actually not a subroutine call but a disguised `push` for the data that follows it.
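To make the disguise concrete, here is a minimal Python sketch, independent of IDA, that decodes a 32-bit near call (opcode `E8`) in a raw byte buffer and reports which address it pushes and where execution continues. The function name, the sample bytes, and the base address are made up for illustration, and the sketch assumes 32-bit x86 with flat addressing.

```python
import struct

def decode_call_over_data(code, base, off):
    """Decode an E8 rel32 call at code[off] and describe the 'push in disguise'.

    Returns (pushed_addr, target_addr, inlined_bytes), or None if there is
    no forward near call inside the buffer at that offset.
    """
    if off + 5 > len(code) or code[off] != 0xE8:
        return None
    (rel,) = struct.unpack_from("<i", code, off + 1)
    if rel <= 0 or off + 5 + rel > len(code):
        return None                 # only forward calls that stay in the buffer
    pushed = base + off + 5         # return address = start of the inlined data
    target = pushed + rel           # where execution actually continues
    return pushed, target, bytes(code[off + 5:off + 5 + rel])

# Hypothetical snippet: call +6 ; "hello\0" ; pop eax ; ret
sample = b"\xE8\x06\x00\x00\x00" + b"hello\x00" + b"\x58\xC3"
pushed, target, data = decode_call_over_data(sample, 0x401000, 0)
print(hex(pushed), hex(target), data)
```

The address the `call` pushes as its "return address" is exactly the start of the inlined data, which is why the instruction at the call target can simply pop it.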
We can manually fix this by undefining the sub (otherwise IDA's auto-analysis will override our judgement), making it code again, jumping to the location after the call, turning the data there into a string, and fixing up the code that follows:
There you can see it's just a `call`-`pop` pair, which effectively loads the string's address into a register.
While this is better (although tedious to do), switching to graph view will fail. More precisely, turning the whole code back into a function fails because of the undefined instructions, and without a function we won't get our graph view back.
So not exactly a solution.
Modify the code
We know this is a `push` in disguise. We can also freely modify the binary in IDA. So why not just make it a proper `push`?
To do that, we will perform the following steps:
- Add a segment with the same size as the current one
- Copy the inlined data to the new segment at the same offset
- Turn the `call` into a `push` followed by a jump to the call's original target
- Some cosmetic stuff
All these can be done in a small script.
The reason for creating a same-sized segment is that we do not have to do any bookkeeping: we just copy the inlined data to the same offset in the new storage segment. This is not a nice solution for big targets, but it works just fine for anything else, and not keeping state makes the script easy to use.
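The same-offset idea boils down to one subtraction and one addition. A minimal sketch in plain Python, where the function name and the segment addresses are hypothetical and chosen only for illustration:

```python
def storage_address(data_start, seg_start, seg_end, storage_start):
    """Map an address in the source segment to the same offset in the storage segment."""
    if not (seg_start <= data_start < seg_end):
        raise ValueError("address is outside the source segment")
    offset = data_start - seg_start
    return storage_start + offset

# Hypothetical layout: code at 0x401000-0x402000, storage segment at 0x500000
CODE_START, CODE_END = 0x401000, 0x402000
STORAGE_START = 0x500000
dest = storage_address(0x401005, CODE_START, CODE_END, STORAGE_START)
print(hex(dest))
```

Because both segments have the same size, every source address has a unique destination and two inlined blobs can never overlap in the storage segment, so no allocation state is needed.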
The script assumes two things:
- Your cursor is on the `call` instruction, so you can quickly use the script bound to a hotkey
- We have manually created the `storage` segment. This could be done in the script, of course.
The proof-of-concept script then is:
    import idaapi
    import ida_idp
    from idc import *


    def get_storage_segment():
        """Return the start address of the segment named 'storage', or None."""
        seg = get_first_seg()
        while seg != BADADDR and get_segm_name(seg) != "storage":
            seg = get_next_seg(seg)
        if seg != BADADDR:
            return seg
        else:
            return None


    def fix_call():
        ea = get_screen_ea()
        if print_insn_mnem(ea) != 'call':
            print("Not a call instruction!")
            return

        # address of the trailing 'pop' instruction
        call_target = get_operand_value(ea, 0)
        # the inlined data starts right after the call
        data_start = next_head(ea)
        data_len = call_target - data_start

        storage_addr = get_storage_segment()
        if storage_addr is None:
            print("Error: Segment 'storage' not found")
            return

        # get offset in this segment
        offset = data_start - get_segm_attr(data_start, SEGATTR_START)
        copy_dest = storage_addr + offset
        for i in range(data_len):
            patch_byte(copy_dest + i, get_wide_byte(data_start + i))

        # replace the call with a push of the copied data, then jump to the old target
        ida_idp.assemble(ea, 0, ea, True, "push 0%08xh" % copy_dest)
        ea += get_item_size(ea)
        ida_idp.assemble(ea, 0, ea, True, "jmp 0%08xh" % call_target)
        ea += get_item_size(ea)

        # Undefine the inlined data to clean up the disassembly
        del_items(ea, DELIT_SIMPLE, call_target - ea)
        # Add a name to the copied data
        set_name(copy_dest, "inlined_%08x" % data_start)


    idaapi.add_hotkey("2", fix_call)
It does just what is described above: it calculates the length and offset of the inlined data and the addresses for the `push` and `jmp` instructions we are going to patch in, copies the data, patches the instructions, and performs a bit of cleanup.
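To see what the patch does at the byte level, the following sketch models it on a plain `bytearray` outside of IDA. It is an illustration under stated assumptions, not the script's actual mechanism: it assumes 32-bit x86, treats the buffer as a segment whose offsets map one-to-one into a storage area starting at a hypothetical `storage_start`, and picks the `jmp` encoding itself (short `EB` or near `E9`), whereas the script leaves that choice to IDA's assembler.

```python
import struct

def patch_call_to_push(code, off, storage_start):
    """Model the rewrite of 'call over data' at code[off] into 'push imm32; jmp target'.

    Returns (patched_code, storage), where storage maps the destination
    address to the copied inlined bytes.
    """
    buf = bytearray(code)
    if buf[off] != 0xE8:
        raise ValueError("not an E8 near call")
    (data_len,) = struct.unpack_from("<i", buf, off + 1)
    data_off = off + 5                       # inlined data follows the call
    if data_len < 2 or data_off + data_len > len(buf):
        raise ValueError("inlined data too short or out of range")

    # copy the data to the same offset in the (modelled) storage segment
    copy_dest = storage_start + data_off
    storage = {copy_dest: bytes(buf[data_off:data_off + data_len])}

    # push imm32 is 5 bytes, exactly the size of the call it replaces
    buf[off:off + 5] = b"\x68" + struct.pack("<I", copy_dest)

    # the jmp overwrites the start of the old data and skips the rest of it
    if data_len - 2 <= 127:
        buf[data_off:data_off + 2] = b"\xEB" + struct.pack("<b", data_len - 2)
    else:
        buf[data_off:data_off + 5] = b"\xE9" + struct.pack("<i", data_len - 5)
    return bytes(buf), storage

# Hypothetical snippet: call +6 ; "hello\0" ; pop eax ; ret
sample = b"\xE8\x06\x00\x00\x00" + b"hello\x00" + b"\x58\xC3"
patched, storage = patch_call_to_push(sample, 0, 0x500000)
print(patched.hex(), storage)
```

One consequence visible in the model: the `jmp` lives in the space formerly occupied by the inlined data, so the data has to be at least as long as the jump instruction for the patch to fit.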
Note that this may have issues with segmentation. I think I had some odd configuration where some API calls returned a full address (with respect to the segment address) and some did not, but I couldn't work out which configuration that was when writing this article.
Always, always back up your `.idb` before using scripts that modify it, like this one. Even if the code performs as intended, you will find edge cases where it fails and ruins your database.
If we run the above script on the example and do minimal manual intervention (tell IDA that the `push` is using an offset, and that the bytes following the `push` are also code) we get this:
And eventually, we can turn this into a subroutine and switch to graph view: