[Tutorial] Returning strings and arrays

Started by SA:MP, May 05, 2023, 05:48 AM

While functions in Pawn typically return only simple cell-sized values, the language also allows functions (even native ones) to return arrays, and therefore strings. However, the mechanism behind this is a bit more complex and is not suited to every use case.



First of all, let's look at the standard way of getting a string out of a function.



Via an output parameter

All native SA-MP functions that produce a string do so via a standard non-const array parameter. Since arrays are passed by reference (meaning the function gets access to the actual variable and not just the value inside), the native function can easily store any data inside:


pawn Code:

new name[MAX_PLAYER_NAME + 1];
GetPlayerName(playerid, name, sizeof name);

Usually, another parameter is used alongside the array parameter to specify the length of the array. In Pawn, there is no simple way to obtain the length of an array at runtime, so the compiler has to supply it at compile time (via the sizeof operator). For strings, an extra cell must be allocated in the array to store the null character that marks the end of the string (hence the + 1).



This approach is especially useful for obtaining variable-length arrays (such as strings). The format function, for example, can only be implemented this way, because the theoretical length of the output string is unlimited.
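
The same pattern works for your own functions. The following is only a minimal sketch: GetGreeting, its parameter names, and the message text are hypothetical and not part of SA-MP. It uses a default value of sizeof dest so that the caller does not have to pass the length explicitly.

pawn Code:

// Hypothetical helper (not an SA-MP function): writes a greeting into dest.
stock GetGreeting(playerid, dest[], size = sizeof dest)
{
    new name[MAX_PLAYER_NAME + 1];
    GetPlayerName(playerid, name, sizeof name);
    // The output is bounded by the size the caller provided.
    format(dest, size, "Hello, %s!", name);
}

// Possible usage: the compiler fills in the size of greeting automatically.
new greeting[64];
GetGreeting(playerid, greeting);
SendClientMessage(playerid, -1, greeting);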



Returning an array directly

If you want to produce a fixed-size array, you can return it from a function directly:


pawn Code:

forward [MAX_PLAYER_NAME + 1]PlayerName(playerid);
stock PlayerName(playerid)
{
    new name[MAX_PLAYER_NAME + 1];
    GetPlayerName(playerid, name, sizeof name);
    return name;
}

The forward declaration is optional, but it is useful as a reminder that the length of the returned array matters.



Now, if you know other languages such as C, you may be aware that since name is allocated on the stack, it no longer exists once the function returns. Pawn gets around this with a trick: when the function is called, extra space for the array is allocated by the caller, and the address of this space is passed to the function via a secret parameter.



In reality, the function looks like this:


pawn Code:

stock PlayerName(playerid, output[MAX_PLAYER_NAME + 1])
{
    new name[MAX_PLAYER_NAME + 1];
    GetPlayerName(playerid, name, sizeof name);
    output = name;
}

As you can see, returning a string is just a convenient syntactic shortcut for an extra output parameter.



However, this comes at the cost of slightly decreased performance and, in certain cases, bugs. Let's start with the bugs:


pawn Code:

stock Select(index, arg[], arg2[])
{
    if(index == 0) return arg;
    return arg2;
}

This simple function seems to return one of the arguments, but since they have indeterminate sizes (represented as 0), the compiler thinks this function returns a zero-sized array and does not actually allocate any extra space for the string. You cannot return strings (or arrays) that have indeterminate length, and the compiler sometimes fails to inform you about this.
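
One way to avoid the problem is to fall back to an output parameter. The following is a sketch only; SelectInto and its dest and size parameters are hypothetical names. It relies on strcat from the standard string include, used on an emptied buffer so that it acts as a bounded copy.

pawn Code:

// Hypothetical replacement for Select: copies the chosen argument into dest.
stock SelectInto(index, const arg[], const arg2[], dest[], size = sizeof dest)
{
    dest[0] = '\0'; // start from an empty string
    if(index == 0) strcat(dest, arg, size);
    else strcat(dest, arg2, size);
}

// Possible usage:
new chosen[32];
SelectInto(1, "first", "second", chosen);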




pawn Code:

forward [4]Func1();
stock Func1()
{
    new str[] = "abc";
    return str;
}

forward [4]Func2();
stock Func2()
{
    return Func1();
}

This code is also horribly wrong, but in a subtle way. Func2 does allocate extra space for the array returned from Func1, but before the array can be copied to the secret output array of the second function, it is deallocated again and isn't accessible anymore.
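
A commonly used workaround is to store the result in a local variable before returning it. This is a sketch that reuses Func1 from the block above and assumes that assigning a returned array to a local array of the same size works as usual; it costs one more copy.

pawn Code:

forward [4]Func2();
stock Func2()
{
    new str[4];
    str = Func1(); // copy the result into a local array first
    return str;    // then return the local array as in the earlier examples
}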




pawn Code:

stock Func(...)
{
    new str[] = "abc";
    return str;
}

The extra output argument is placed at the end of all arguments, even after the variadic ones, so its actual position depends on how many arguments were passed. However, the compiler fails to account for this and assumes the position of the return address is constant.



In all these cases, returning a string is a really bad thing to do, since the code usually compiles fine, and the issue only becomes apparent at runtime.



There is a slight performance cost associated with returning arrays as well: the output = name; always happens if the function is implemented in Pawn, and so the array is copied at least once. Take a look at this code:




pawn Code:

new name[MAX_PLAYER_NAME + 1]; // initializing with "= PlayerName(playerid)" here doesn't work, so assign on the next line
name = PlayerName(playerid);

The compiler again allocates extra space for the string returned from PlayerName (always on the heap) before copying it to name. Therefore, the array is unnecessarily copied twice before it is usable.



As you can see, returning arrays has some significant drawbacks, but it is still useful in some cases, as long as you are careful.


pawn Code:

format(string, sizeof string, "Your name is %s.", PlayerName(playerid));

This is the intended usage of returning arrays: as temporary arguments to other functions. Here, no extra copying happens at the caller's site. In all other cases, using a normal output parameter is safer and faster.



PlayerName itself can be "fixed" to return the string directly without unnecessary copying, via in-line assembly:


pawn Code:

forward [MAX_PLAYER_NAME + 1]PlayerName(playerid);
stock PlayerName(playerid)
{
    #assert MAX_PLAYER_NAME + 1 == 25
    #emit PUSH.C 25 // size parameter of GetPlayerName
    #emit PUSH.S 16 // secret return parameter of PlayerName at address 16 (&playerid + 4)
    #emit PUSH.S playerid // equal to 12
    #emit PUSH.C 12 // number of bytes passed to the function (4 * 3 arguments)
    #emit SYSREQ.C GetPlayerName // calling the function
    #emit STACK 16 // cleanup of the arguments from the stack
    #emit RETN
}

This way, PlayerName does no extra copying, passing the secret return address directly to GetPlayerName.



PawnPlus strings

Dynamic strings in PawnPlus offer the flexibility and convenience of normal values, since they are passed around as references.


pawn Code:

stock String:Func()
{
    return str_new("abc");
}

The string can be returned from functions, passed to other functions or even to native functions, inspected, or modified, and all of this without any additional copying.



Constant strings

All the previous methods are suitable for strings that are produced at runtime, but for constant strings, I'd advise against using any of them. Use macros instead.




pawn Code:

new const _tips[][] = {
    "Tip one",
    "Tip two",
    "Tip three",
    "Tip four"
};
#define RandomTip() (_tips[random(sizeof _tips)])

Using this in functions like SendClientMessage requires no string copying at all, since the address of the string is used directly. You can also afford to use strings of different lengths in the array, which makes the storage slightly more efficient.
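
As a brief illustration, the macro can be used wherever a string argument is expected. The color value -1 (white) is just an example choice here.

pawn Code:

// RandomTip() expands to an element of _tips, so no temporary copy is made.
SendClientMessage(playerid, -1, RandomTip());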

Source: [Tutorial] Returning strings and arrays