• C Programming - Var Args trouble.
    5 replies, posted
I am in the midst of a large project, and I am running into some weird issues. Here's the deal. I am messing about with making an operating system. I am by no means claiming to be good at it, or saying it will be 1/100th of what real operating systems can do. Disclaimer now being said, here we go.

For this project, I decided I would also write my own libc and libk. I thought it would offer me a steep yet rewarding experience to get myself more familiar with C programming in general. So I decided I would write a printf( const char *, ... ) function to print data. And so the headaches began. Here is the code I am testing with:

    int printf( const char * format, ...){
        int written = 0;
        va_list parameters;
        va_start(parameters, format);
        while (*format != '\0'){
            switch (*format){
                case '%' :{
                    format++;
                    switch (*format){
                        case 's' :{
                            char * str = va_arg( parameters, char *);
                            terminal_writestring( str );
                            break;
                        }
                        case 'c' :{
                            //char a = (char)va_arg( parameters, int );
                            //terminal_putchar( a );
                            break;
                        }
                    }
                    format++;
                    break;
                }
                case '\n' :{
                    terminal_newline();
                    format++;
                    break;
                }
                default :{
                    terminal_putchar( *format );
                    format++;
                    written++;
                    break;
                }
            }
        }
        va_end( parameters );
        return written;
    }

When tested with printf( "%s\n", "Test" ) it outputs nonsense with "S ". Oddly enough, when I exit the function from the "%s" case, it prints fine. If I only do 1 iteration of the while loop, it works fine. So my thinking is that it is one of 2 problems:

a) The stack is messed up somehow
b) The compiler is outputting incorrect code.
Here is an objdump from printf.o (I also included a few comments to mark what is happening where, for my own benefit at first):

    00000000 <printf>:
      #Allocated 0x18 bytes for variables
       0: 55                    push   ebp
       1: 89 e5                 mov    ebp,esp
       3: 83 ec 18              sub    esp,0x18
      #stack:
      #
      # 0x10
      # 0x0c
      # 0x08 (WORD) char * format
      # 0x04
      # -------------- ebp -------------
      # -0x08
      # -0x10
      # -0x14
      # -0x18 (DWORD) 0
      #
       6: c7 45 f4 00 00 00 00  mov    DWORD PTR [ebp-0xc],0x0
       d: 8d 45 0c              lea    eax,[ebp+0xc]
      10: 89 45 ec              mov    DWORD PTR [ebp-0x14],eax
      13: eb 78                 jmp    8d <printf+0x8d>
      # While loop
      15: 8b 45 08              mov    eax,DWORD PTR [ebp+0x8]
      18: 0f b6 00              movzx  eax,BYTE PTR [eax]
      1b: 0f be c0              movsx  eax,al
      # Test case '\n'
      1e: 83 f8 0a              cmp    eax,0xa
      21: 74 41                 je     64 <printf+0x64>
      # Test case '%'
      23: 83 f8 25              cmp    eax,0x25
      26: 75 47                 jne    6f <printf+0x6f>
      # Case '%' Routine:
      # increment format
      28: 83 45 08 01           add    DWORD PTR [ebp+0x8],0x1
      2c: 8b 45 08              mov    eax,DWORD PTR [ebp+0x8]
      2f: 0f b6 00              movzx  eax,BYTE PTR [eax]
      32: 0f be c0              movsx  eax,al
      # Test case 'c'
      35: 83 f8 63              cmp    eax,0x63
      38: 74 23                 je     5d <printf+0x5d>
      # Test case 's'
      3a: 83 f8 73              cmp    eax,0x73
      3d: 75 1f                 jne    5e <printf+0x5e>
      # Case 's' Routine
      3f: 8b 45 ec              mov    eax,DWORD PTR [ebp-0x14]
      42: 8d 50 04              lea    edx,[eax+0x4]
      45: 89 55 ec              mov    DWORD PTR [ebp-0x14],edx
      48: 8b 00                 mov    eax,DWORD PTR [eax]
      4a: 89 45 f0              mov    DWORD PTR [ebp-0x10],eax
      4d: 83 ec 0c              sub    esp,0xc
      50: ff 75 f0              push   DWORD PTR [ebp-0x10] ## ds:[ebp -0x10]
      53: e8 fc ff ff ff        call   54 <printf+0x54>
      58: 83 c4 10              add    esp,0x10
      5b: eb 01                 jmp    5e <printf+0x5e>
      # Case 'c' Routine
      5d: 90                    nop
      # Increment format
      5e: 83 45 08 01           add    DWORD PTR [ebp+0x8],0x1
      62: eb 29                 jmp    8d <printf+0x8d>
      #New line
      64: e8 fc ff ff ff        call   65 <printf+0x65>
      69: 83 45 08 01           add    DWORD PTR [ebp+0x8],0x1
      6d: eb 1e                 jmp    8d <printf+0x8d>
      6f: 8b 45 08              mov    eax,DWORD PTR [ebp+0x8]
      72: 0f b6 00              movzx  eax,BYTE PTR [eax]
      75: 0f be c0              movsx  eax,al
      78: 83 ec 0c              sub    esp,0xc
      7b: 50                    push   eax
      7c: e8 fc ff ff ff        call   7d <printf+0x7d>
      81: 83 c4 10              add    esp,0x10
      84: 83 45 08 01           add    DWORD PTR [ebp+0x8],0x1
      88: 83 45 f4 01           add    DWORD PTR [ebp-0xc],0x1
      8c: 90                    nop
      #exit-test routine
      8d: 8b 45 08              mov    eax,DWORD PTR [ebp+0x8]
      90: 0f b6 00              movzx  eax,BYTE PTR [eax]
      93: 84 c0                 test   al,al
      95: 0f 85 7a ff ff ff     jne    15 <printf+0x15>
      9b: 8b 45 f4              mov    eax,DWORD PTR [ebp-0xc]
      9e: c9                    leave
      9f: c3                    ret

I hadn't finished the stack representation at the top, but this is as far as I got before deciding I needed to ask for some help with this problem. I've been pulling my hair out over this, since other people's code that worked for them doesn't work here, which points toward the compiler. My compiler is an i386-elf cross compiler. The stack segment is different from the data segment. Other function calls work fine. Bloody fed up with this one problem taking up most of my time thinking about what is wrong. Any help is appreciated, rewarded with coins and other sexual favours. Thanks
It's been a while since I've done anything in C, but to my knowledge you're passing a pointer with *, and as such whatever is in your passed variable is what would normally be put there. But instead you're seeing what's currently taking up that spot in memory, hence gibberish. I guess the question is why are you using a pointer instead of passing by reference?
Would that be for the "char * format" or the va_args? To my knowledge, when dealing with a string of characters in C, it is of type "char *". I am not sure if it's anything to do with that, since whether it works depends on which other parts of the code are commented out. For example:

    int printf( const char * format, ...){
        int written = 0;
        va_list parameters;
        va_start(parameters, format);
        while (*format != '\0'){
            switch (*format){
                case '%' :{
                    format++;
                    switch (*format){
                        case 's' :{
                            char * str = va_arg( parameters, char *);
                            terminal_writestring( str );
                            goto hell;
                            break;
                        }
                        case 'c' :{
                            //char a = (char)va_arg( parameters, int );
                            //terminal_putchar( a );
                            break;
                        }
                    }
                    format++;
                    break;
                }
                case '\n' :{
                    terminal_newline();
                    format++;
                    break;
                }
                default :{
                    terminal_putchar( *format );
                    format++;
                    written++;
                    break;
                }
            }
        }
        hell:
        va_end( parameters );
        return written;
    }

Works. However:

    int printf( const char * format, ...){
        int written = 0;
        va_list parameters;
        va_start(parameters, format);
        while (*format != '\0'){
            switch (*format){
                case '%' :{
                    format++;
                    switch (*format){
                        case 's' :{
                            char * str = va_arg( parameters, char *);
                            terminal_writestring( str );
                            goto hell;
                            break;
                        }
                        case 'c' :{
                            char a = (char)va_arg( parameters, int );
                            terminal_putchar( a );
                            break;
                        }
                    }
                    format++;
                    break;
                }
                case '\n' :{
                    terminal_newline();
                    format++;
                    break;
                }
                default :{
                    terminal_putchar( *format );
                    format++;
                    written++;
                    break;
                }
            }
        }
        hell:
        va_end( parameters );
        return written;
    }

doesn't. These are both executed with "printf( "%s\n", "test")". Of course, it doesn't print the newline because I exit the while loop with "goto hell;". So I am pretty sure it is something the compiler is doing. I have turned off -O2, however I need to move some code about in my bootstrapper to load more than just 2kb, since turning off -O2 causes the code to be twice as large. So I am yet to see whether the optimization level (-On) causes the problem, however it should nonetheless work regardless.
Long story short, and correct me if I am wrong, I am confident that my argument types are correct, since it "CAN" work, if I do the "goto hell;" test
It shouldn't be entering the case 'c' at all, right? Are you able to step through it and confirm whether it is entering the 'c' case? If all you're changing is commenting out the behavior for 'c', it sounds like it might for some reason be jumping into that case. It might also be helpful to make sure it's only entering that switch statement ONCE.
Can confirm that in this example, %c is never entered. The % switch is also only entered once. I am going to try using a different gcc version (currently 9.0.0, which is marked as experimental). I have yet to try this without optimization flags on the compiler, due to the fact that my bootloader restricts me to 2kb, and 4+kb is required for no optimization. I think it's definitely some compiler trickery happening.
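One way to separate "the compiler is wrong" from "the environment is wrong" is to build the same function with an ordinary hosted compiler and capture its output in a buffer. The following is only a sketch under that assumption: host_out, host_reset, host_output_is and test_printf are made-up names, and the terminal_* functions here are stand-ins, not the kernel's real ones. It also adds one guard for a format string that ends in '%', which the code in the thread does not have.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical host-side capture buffer (not part of the thread's code). */
char host_out[256];
size_t host_len = 0;

/* Stand-ins for the kernel's terminal_* routines: append to host_out. */
void terminal_putchar(char c) {
    if (host_len + 1 < sizeof host_out) {
        host_out[host_len++] = c;
        host_out[host_len] = '\0';
    }
}

void terminal_writestring(const char *s) {
    while (*s != '\0') {
        terminal_putchar(*s++);
    }
}

void terminal_newline(void) { terminal_putchar('\n'); }

/* Clear the capture buffer between test calls. */
void host_reset(void) {
    host_len = 0;
    host_out[0] = '\0';
}

/* Return 1 if the captured output equals expected, 0 otherwise. */
int host_output_is(const char *expected) {
    size_t i = 0;
    while (expected[i] != '\0' && host_out[i] == expected[i]) {
        i++;
    }
    return expected[i] == '\0' && host_out[i] == '\0';
}

/* The thread's printf with the 'c' case enabled, renamed so it does not
   clash with the host libc. */
int test_printf(const char *format, ...) {
    int written = 0;
    va_list parameters;

    assert(format != NULL);
    va_start(parameters, format);
    while (*format != '\0') {
        switch (*format) {
        case '%':
            format++;
            switch (*format) {
            case 's': {
                const char *str = va_arg(parameters, char *);
                terminal_writestring(str);
                break;
            }
            case 'c': {
                char a = (char)va_arg(parameters, int);
                terminal_putchar(a);
                break;
            }
            }
            /* Added guard: do not step past the terminator on a trailing '%'. */
            if (*format != '\0') {
                format++;
            }
            break;
        case '\n':
            terminal_newline();
            format++;
            break;
        default:
            terminal_putchar(*format);
            format++;
            written++;
            break;
        }
    }
    va_end(parameters);
    return written;
}
```

Calling host_reset(), then test_printf("%s\n", "Test"), then host_output_is("Test\n") checks the failing case from the thread. If that passes on the host but the cross-compiled build still prints garbage, the C logic is probably fine and the target setup (stack, segments, loader) is the more likely suspect.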
Sorry for the spam today guys, but I have some answers to my problem. This is one of those cases where me "trying" to be smart has come back to bite me on the arse. So here is what was happening. If we look closer at the Case 's' subroutine:

      3f: 8b 45 ec              mov    eax,DWORD PTR [ebp-0x14]
      42: 8d 50 04              lea    edx,[eax+0x4]
      45: 89 55 ec              mov    DWORD PTR [ebp-0x14],edx
      48: 8b 00                 mov    eax,DWORD PTR [eax]
      4a: 89 45 f0              mov    DWORD PTR [ebp-0x10],eax

Now, ebp-0x14 holds the va_list (parameters), not format (format lives at ebp+0x8). va_start set it to ebp+0xc with the lea at offset d, so it is a pointer to the first variadic argument on the stack. It is a pointer, so really let's just think about it as a number. We move this number into eax, load edx with eax+4, and store edx back to ebp-0x14, so the va_list now points at the next argument. That is the va_arg step.

Current sit-rep: eax holds the address of the current argument on the stack, edx holds the address of the next one. Next, we get the value at [eax], and store it in eax.

Now, a little background: when doing [eax + esi + 4], or WHATEVER that isn't ebp or esp in the place of eax, the CPU understands this as ds:[eax + esi + 4]. Whereas when ebp is used, it uses the stack segment, ss:[ebp + esi + 4]. This is where the problem comes from. I was trying to be smart by setting up my ss register to hold a segment different to that of ds. This makes my memory layout "not flat". I was doing this so I could segregate my stack from being read or written to through normal ds-style memory access (don't ask me why, because after all of this I have no idea).
Now to cut a long explanation short, here is an example of where this let me down:

    lea eax, DWORD PTR [ ebp ]   ;Load ebp into eax
    mov [eax], "something"       ;Write to eax (aka ebp)

In the case where ds != ss, "something" would not be written to ss:[ebp], as you would expect from a flat memory model, but rather to ds:[eax]. In my case that is somewhere other than the stack.

Coins go to flak since he was closest by simply saying "What you're trying to access isn't there". Leaving this post for anyone else who is ill-equipped for OS theory.
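For anyone who wants to see the ds/ss mismatch without booting anything, here is a toy model in plain C. It is only a sketch: the segment struct, the linear array, and the names load32, store32, fetch_vararg and simulate are all invented for illustration, and the "segments" are just base offsets into one array. It replays the va_arg sequence from the 's' case: the slot at ebp-0x14 is accessed through ss, and the pointer read from it is then dereferenced through ds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy linear memory; a "segment" is just a base offset into it. */
static uint8_t linear[1024];

typedef struct {
    uint32_t base;
} segment;

/* Read a 32-bit value at seg:offset. */
uint32_t load32(segment seg, uint32_t offset) {
    uint32_t v;
    assert((size_t)seg.base + offset + sizeof v <= sizeof linear);
    memcpy(&v, &linear[seg.base + offset], sizeof v);
    return v;
}

/* Write a 32-bit value at seg:offset. */
void store32(segment seg, uint32_t offset, uint32_t value) {
    assert((size_t)seg.base + offset + sizeof value <= sizeof linear);
    memcpy(&linear[seg.base + offset], &value, sizeof value);
}

/* Models the va_arg sequence from the listing:
     mov eax,[ebp-0x14]   ; ebp-based, so the CPU uses ss
     lea edx,[eax+0x4]
     mov [ebp-0x14],edx   ; ebp-based, ss again
     mov eax,[eax]        ; eax-based, so the CPU uses ds */
uint32_t fetch_vararg(segment ss, segment ds, uint32_t ebp) {
    uint32_t ap = load32(ss, ebp - 0x14);
    store32(ss, ebp - 0x14, ap + 4);
    return load32(ds, ap);
}

/* Build a fake frame with one variadic argument and fetch it. */
uint32_t simulate(uint32_t ds_base, uint32_t ss_base) {
    const uint32_t ebp = 0x40;
    const uint32_t arg = 0xCAFEBABEu;
    segment ds = { ds_base };
    segment ss = { ss_base };

    memset(linear, 0xAA, sizeof linear);  /* recognizable filler */
    store32(ss, ebp + 0xc, arg);          /* the caller's pushed argument */
    store32(ss, ebp - 0x14, ebp + 0xc);   /* va_start: lea eax,[ebp+0xc] */
    return fetch_vararg(ss, ds, ebp);
}
```

With equal bases, simulate(0, 0) returns 0xCAFEBABE, the argument that was stored. With different bases, simulate(0, 0x200) returns 0xAAAAAAAA, the filler: the offset is right but it is looked up in the wrong segment, which is the same shape as the garbage string in the thread.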