README for the »coff-avr-patch«                              2003-05-16
===============================

Background
----------

In order to convert the ELF format used in the opensource AVR toolchain
(commonly known as just »avr-gcc«), with its debugging information
called »stabs«, to the AVR COFF format as defined by Atmel for its AVR
Studio, two conversion steps are basically necessary.

First, the object file format itself needs to be translated from ELF to
COFF. This is a relatively simple conversion for the most part, since
the contents of the various sections of the object file can be copied
verbatim, and the main change is a different object file header. The
»objcopy« tool that comes with GNU binutils (called »avr-objcopy« in
our case) was designed for this kind of conversion, and GNU binutils
itself could already handle more than two dozen different COFF file
formats, so for the most part, a common subset of COFF utilities could
already be used. (COFF is a very old object file format that was in
wide use at the end of the 1980s and the beginning of the 1990s.)

There are two different file formats defined by Atmel: the original AVR
COFF format, and a newer version called AVR »extended« COFF that is
understood by AVR Studio 4.07 and above. In order to provide an easy
way to distinguish the two, two different so-called BFD names have been
chosen within GNU binutils: coff-avr for the former, coff-ext-avr for
the latter. (BFD stands for »binary file descriptor« and is basically
the central library of the GNU binutils suite that handles all the
object file IO.) In short, choose coff-ext-avr for AVR Studio 4.07 and
above, and coff-avr for anything else (which includes AVR Studio 3.x
and VMLAB). The main differences are that the newer format can
transport long filenames (so directory information will be retained),
and that it supports debugging of struct types.

The tough part was to convert the stabs debugging information into the
so-called native COFF debugging format, which is basically an extension
of the symbol table that can usually be found in object file formats.
Symbol tables provide a listing of names vs. object file sections plus
addresses within that section, which can be displayed with the (avr-)nm
tool. COFF has extended the meaning of a symbol by adding some
debugging information to it within the symbol table. In contrast, the
approach taken with ELF is to leave the symbol table as it is (the
so-called non-debugging symbols), and add the debugging information in
separate sections within the object file. Since the amount of
information that can be transported by native COFF debugging is
limited, the approach of using (stabs) debugging information in
separate sections has also been in widespread use with COFF on Unix
systems, but unfortunately Atmel chose native COFF debugging
information instead.

Fortunately, the objcopy tool had already implemented part of the task:
the --debugging option was already there, and is meant to convert the
debugging information from the input file into a generic internal
debugging format, and from there into the debugging information
required for the output file. So ``only'' a COFF debugging generator
needed to be written. (The source code for this converter alone is
close to 100 KB, in almost 3,500 lines of C code...)

The conversion from ELF/stabs to AVR COFF
-----------------------------------------

The usage of objcopy to convert ELF/stabs debugging information into
AVR COFF debugging information is (long lines wrapped with
backslashes):

    avr-objcopy \
        --debugging \
        -O ${FORMAT} \
        --change-section-address .data-0x800000 \
        --change-section-address .bss-0x800000 \
        --change-section-address .noinit-0x800000 \
        --change-section-address .eeprom-0x810000 \
        ${filename}.elf ${filename}.cof

where ${FORMAT} should be either "coff-avr" (COFF format that matches
the older Atmel AVR COFF documentation, as understood by AVR Studio 3,
early versions of AVR Studio 4, and also by VMLAB), or "coff-ext-avr"
(current AVR »extended« COFF specification, as understood by AVR Studio
4.07; adds long filenames and structure debugging).

There are some more options dealing with the mapping of debugging
source file names for coff-ext-avr, which I can explain later if you
need them (coff-avr only supports 14-char filenames, so no source file
directory information can be transferred there).

In the WinAVR template Makefile, you can say »make coff« to get the
resulting file ($(TARGET).cof) in the coff-avr format, and »make
extcoff« to get it in the coff-ext-avr format.

There might be some warnings when you run the above, like

    Warning: file {standard input} not found in symbol table, ignoring
    Warning: ignoring function __vectors() outside any compilation unit
    Warning: ignoring function __bad_interrupt() outside any compilation unit

Perhaps there will be more of them if your avr-libc has been installed
with debugging symbols (the default WinAVR installation strips
debugging symbols from the installed library files). Normally, there
should be no other warnings.

Run it on your jobs, and see if it works as expected. In particular,
I'm interested in getting results for more complex datatypes, like
arrays of pointers to functions, or functions returning pointers to
structures, and such.

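The objcopy invocation shown in this section could be wrapped in a
small shell function so that the format name is checked before the tool
runs. This is only a sketch: the function name elf2coff and the OBJCOPY
variable are made up for this example and are not part of the patch or
of the WinAVR Makefile.

```shell
#!/bin/sh
# Sketch only: elf2coff and OBJCOPY are names invented for this example.
# OBJCOPY may be overridden in the environment, e.g. for a dry run.
OBJCOPY=${OBJCOPY:-avr-objcopy}

# usage: elf2coff <coff-avr|coff-ext-avr> <file name without .elf suffix>
elf2coff() {
    format=$1
    base=$2

    # Only the two BFD names described in this README are accepted.
    case $format in
        coff-avr|coff-ext-avr) ;;
        *) echo "elf2coff: unknown format: $format" >&2; return 2 ;;
    esac

    # Same options as in the command line shown above.
    "$OBJCOPY" \
        --debugging \
        -O "$format" \
        --change-section-address .data-0x800000 \
        --change-section-address .bss-0x800000 \
        --change-section-address .noinit-0x800000 \
        --change-section-address .eeprom-0x810000 \
        "$base.elf" "$base.cof"
}
```

Calling »elf2coff coff-ext-avr main« would then read main.elf and write
main.cof. Setting OBJCOPY=echo beforehand prints the arguments instead
of running the tool, which can help when checking the command line.
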
COFF woes
---------

When working with COFF files, keep in mind that the format is fairly
limited. Basically, you can cut it down to stating that native COFF
debugging can only support what was already defined in the very first
C implementation, the so-called Kernighan/Ritchie C. There wasn't a
standard for C at that time, so the de-facto standard was what had been
described by the creators of C, Brian Kernighan and Dennis Ritchie, in
their book ``The C programming language''. Anything that appeared
later, starting with the extensions of the so-called ANSI C (now called
C89), is basically unsupported in COFF, because the debugging format
was not designed with extension in mind. Needless to say, this also
includes any C99 or C++ extensions.

Also not supported in COFF are include files. So if you've got an
include file that actually produces code (e. g. in an inline function),
don't be surprised that the line numbers in the object file will come
out as junk.

COFF tried to save space everywhere, so its line numbers are defined as
16-bit quantities. In order to not limit a single source file to 64 K
lines, line numbers are defined relative to the beginning of a
function. Thus, COFF line numbers can only exist inside functions, and
since functions need an input file name, if we don't have one (as can
happen in some assembler files assembled with --gstabs), we are hosed.

Finally, even AVR Studio 4.07 implements only a subset of COFF
debugging information. Currently still unsupported are bitfield
structs, enums, and typedefs. Note that the binutils patch does support
all of these features -- only AVR Studio cannot handle them.

COFF file analysis using avr-objdump
------------------------------------

avr-objdump -g gives a nice pseudo-C code listing of the debug
information contained in an object file. If you suspect any problems
with the conversion of the debugging information, first try running
this on both the input ELF file and the output COFF file.

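One possible way to do this is to dump both files and look at the
differences side by side. The following is only a sketch: the function
name compare_debug_info, the OBJDUMP variable, and the .dbg file names
are made up for this example. Only the -g option is taken from the text
above.

```shell
#!/bin/sh
# Sketch only: compare_debug_info, OBJDUMP and the .dbg suffix are
# names invented for this example.
OBJDUMP=${OBJDUMP:-avr-objdump}

# usage: compare_debug_info <file name without .elf/.cof suffix>
compare_debug_info() {
    base=$1

    # Dump the debugging information of the input and the output file.
    "$OBJDUMP" -g "$base.elf" > "$base.elf.dbg" || return 2
    "$OBJDUMP" -g "$base.cof" > "$base.cof.dbg" || return 2

    # diff exits with status 1 when the dumps differ. That is the
    # normal case here, since the two dumps never look identical.
    diff -u "$base.elf.dbg" "$base.cof.dbg"
}
```

After »compare_debug_info main«, the two dumps remain in main.elf.dbg
and main.cof.dbg for closer inspection. The diff is only meant as a
starting point for reading them, not as a pass/fail check.
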
avr-objdump -t gives the raw COFF symbol table listing. It's not
completely "raw" though: AVR COFF has many twists and abominations
compared to the historic standard COFF, and in order to be able to use
all the COFF infrastructure in GNU binutils, I'm "normalizing" most of
this when reading the file (at the same boundary where the
byte-swapping is handled inside binutils), and de-normalizing it to the
AVRisms when writing it. Thus, objdump -t shows the normalized internal
form.

Should you ever need this, I'm happy to explain it in more detail, and
give you both the current Atmel specs for AVR COFF and a pointer to a
generic COFF description. It would be too lengthy to explain that right
now.

Bug reporting
-------------

If you suspect a bug, please first make sure that you've read and
understood all of the above, in particular the limitations of COFF.
Please also make sure that you don't just have a plain usage error.
Remember that there has been a fair amount of alpha testing, so all the
basic problems should have been shaken out by now.

Then, try breaking down what the bug might be. Use the avr-objdump tool
as described above to see what the debugging information in both files
looks like. Obviously, they will never look identical (there are too
many differences in the internal representation between stabs and
native COFF debugging), but they should be similar.

If avr-objdump -g produces the expected result but AVR Studio doesn't
work the way you think, it's most likely not a bug in the conversion.
It might be that the compiler gave incorrect information about an
object to debug (I've seen situations where a variable got optimized
completely away in a simple function, so obviously, the debugging
information for it was wrong), or there might be a problem in the
debugger used (like AVR Studio).

Finally, try to make it reproducible. Don't send 500 KB ZIP files of an
entire project.
Keep in mind that I don't have Windows, and don't run AVR Studio -- so
don't just assume I could reproduce your problem. Please break it down
to the respective portion of the avr-objdump -t output (you can see the
symbol name in the right column, use that for navigation), and send the
portion of this dump that describes the problem. For a function, that
»portion« starts at the symbol with the name of the function, and then
goes from the .bf symbol to the corresponding .ef symbol. For a
structure definition, it starts with the name of the struct, and goes
through the .eos symbol.

So in order to send a bug report, please use the address

Depending on the time available, and the degree of your own
investigation, I'll try to answer them.

Known issues
------------

There are a few remaining known issues with the generated COFF.

The so-called AUX entry pointerization is sometimes wrong. According to
the COFF specs, all the .bf symbols should form a linked list, where
the "next" pointer in each of them points to the next .bf. There are
similar issues for some other of these records. This should normally
not upset debuggers like AVR Studio because they don't care about it,
and it will be fixed soon (but the fix is not so easy, unfortunately).

There is a known and still uninvestigated issue where some bootloader
code could not be debugged.

There is one issue with memory de-allocation that is not yet properly
handled. It is merely cosmetic, since the memory will of course be
freed anyway when the objcopy tool exits.

There are two known issues with the COFF parser of AVR Studio 4.07, the
latest official version as of this writing.

The first is that this parser used to assume a fixed section ordering
instead of checking the COFF file header. Usually, gcc also emits the
various sections in the order assumed by that parser (.text first,
.data following), but under some circumstances, gcc could put .data
first as well.
With the buggy parser, this would make AVR Studio believe your
variables were located in ROM.

The second issue is that AVR Studio until now didn't support the way
gcc allocates the initialization data for initialized variables: they
are not issued as part of the program code, but instead need to be
appended manually after the program code itself. (When preparing an
Intel Hex file, this is usually done by specifying both the .text and
the .data section in the call to avr-objcopy.) So in the end, AVR
Studio was simulating with all the initializers still using 0xff as ROM
contents. There is no fix for this behaviour for AVR Studio 3.x, for
older versions of AVR Studio 4.x (all versions released so far), or for
ELF->COFF conversion tools other than GNU binutils with the current
coff-avr-patch. Starting with this patch, we have negotiated our own
vendor ID in the extended AVR COFF header (0x9cc, mnemonic for "gcc"),
and Atmel agreed to adjust their COFF parser so it also loads the
contents of .data into the simulator flash after .text when it sees
this vendor ID.

Both issues have been fixed internally in Atmel's sources, but so far
there is no official version of the updated COFF parser available.
There have been rumours that they might open their own beta site to the
public in order to have the updated COFF parser tested as well (it is
just a DLL that needs to be replaced).

Dresden, Germany, 2003-05-16
Joerg Wunsch