- consolidate target_record_sp_limit and target_get_sp_limit functions
for aarch64, powerpc, arm-ios and openbsd as there are all without
segmented stacks (no need to duplicate functions).
- rename __load_self function to rust_load_self
- use a mutex inner load_self() as underline implementation is not thread-safe