Resolved: ARM Cortex-M33 (TrustZone, Silabs EFM32PG22) – assembler hardfaults when accessing GPIO or almost any peripheral area, any hint?


Question:

I am lost trying to configure the Silicon Labs EFM32PG22 bare-metal on their dev kit, accessed through the onboard J-Link from Segger Embedded Studio (a great, fast IDE). I have a blink hello-world example in C working from their Simplicity Studio, but I was trying to achieve what I did easily in pure assembler on the Microchip PIC32 MC00 or SAMD21G17D, where only the clocks and startup were configured through the GUI in MPLAB X.

Here I moved to the Segger IDE, where there is no easy way to configure startup and clocks, or I have not found it yet. At the hardware level, the registers of these Cortex parts differ by manufacturer. In C/C++ there is some not-cheap unification through CMSIS, but I only want to know the minimum needed to get raw GPIO working after clock/startup setup.

The Segger project is a generic Cortex-M project for the specific EFM32PG22, so a Cortex-M33 with TrustZone security. I probably do not know what is locked or switched off, or which state the MCU is in (privileged or unprivileged). There are two sets of register mappings, but neither works. As soon as I try to store to, or even load from, the GPIO config registers (or the SMU registers, to query something), a HardFault exception is thrown. All of this happens in the Segger IDE debugger over the onboard J-Link. What am I doing wrong, and what is missing here?
In C, I have only this code:
In blink.s I have this:
With this code in blink.s, it works this way and blinks …
So now I am just curious: what is missing in the pure assembly code to bring the Cortex-M33 into some "easy" state that simply ignores TrustZone, so that it can be used much like, say, a plain Cortex-M3?
Can anybody help? I am digging deep into the datasheet/reference manual, but no luck so far: https://www.silabs.com/documents/public/reference-manuals/efm32pg22-rm.pdf
UPDATE AGAIN: I will try to figure it out. Traversing the system_init C code makes it clear what is going on. There are also some chip errata workarounds, but I never touched the DCDC while initializing, and this may be the culprit.

Best Answer:

Well, okay: manufacturer-specific code generation for MCU startup really IS an important and useful thing. MCUs from different manufacturers differ so much at the register level (even though all of them are Cortex-M based) that it is not worth configuring them manually in assembly if enough flash is available, and it mostly is. So far I have had no luck getting the "generic" ARM/Cortex IDEs (Segger/Keil/IAR) to do this properly on specific parts, so using the manufacturer-specific IDE to configure startup, clocks and peripherals (mostly graphically) is crucial, or at least by far the easiest way. (I know, quite an expensive observation after all the assembly attempts.) After that, it is easy to write even a pure-assembly "blink" hello-world test that is called as an extern C function.

You may ask why I still consider assembly when there are at least the CMSIS "platform abstraction layer" C headers on ARM. (No, they do not help with abstraction, because the devices are still very different; you only get register symbols as #defines, typedefs and enums that make things easier to do in C.) I am trying to compare C-compiled code with handwritten assembly for a specific purpose that needs a forcibly optimized algorithm written from scratch. It is often easier to design such an algorithm directly in assembly than to rely on C compiler optimizations, which are described in a very complex way: each compiler has its own LONG document on how its optimizations work, and at this level C is simply still too abstract and a moving target. This is even more true when you write for several different MCU architectures (think ARM Cortex-M, PIC32/MIPS, or even PIC16/18, PIC24, AVR, MSP430 …). A general algorithm can be described in shared pseudo-assembly, as close to the hardware as possible, without knowing all the optimization quirks of each architecture's C compiler(s), and there is often more than one C compiler per architecture too.
So you can compare C-compiler-generated code with handwritten assembly, and I have already tried such an assembly blink on MANY VERY different architectures. In each case I used the manufacturer-specific IDE to generate the startup in C, using all the GUI configuration and code generation down to an always-compilable empty C project. The code size of these generated startups varies a lot, of course. The most advanced MCUs are really very complex, mostly in clock configuration and pin function configuration, and then in the different peripherals too. Similarities exist only within a single manufacturer, to some extent: MCUs from one manufacturer often share a similar approach.

So the final solution is to have the startup generated and then switch to assembly immediately; this is feasible. When flash is small, it is possible to optimize even the startup code, but that matters mostly on the smallest 8-bit parts, where startup is quite easy anyway or the generated code is small too.
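To make the "generated startup first, then switch to assembly" structure concrete, here is a minimal sketch in C. It is not the code from the question. The names `blink_toggle` and `app_blink` are hypothetical, and `blink_toggle` is written in C here only as a stand-in for a routine that would live in blink.s and be declared `extern` on the target. The register is passed in as a pointer so the sketch can be compiled and checked on a host PC; on real hardware the pointer would be the GPIO data-out register address taken from the reference manual, and the vendor-generated clock/startup init is assumed to have run already.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hand-written assembly routine.
 * On the target this would be only a declaration, e.g.
 *     extern void blink_toggle(volatile uint32_t *dout, unsigned pin);
 * with the body implemented in blink.s. */
void blink_toggle(volatile uint32_t *dout, unsigned pin)
{
    assert(pin < 32u);               /* one 32-bit port register assumed */
    *dout ^= (UINT32_C(1) << pin);   /* flip the pin's output bit */
}

/* Hypothetical application entry, called from main() AFTER the
 * vendor-generated startup/clock/pin configuration has completed. */
void app_blink(volatile uint32_t *dout, unsigned pin, unsigned toggles)
{
    for (unsigned i = 0; i < toggles; ++i) {
        blink_toggle(dout, pin);
        /* a delay loop would go here on real hardware */
    }
}
```

The point of the split is that the C side stays the same whichever implementation of `blink_toggle` is linked in, so the compiler-generated and handwritten versions can be compared in size and speed.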


Source: Stackoverflow.com