GET$# inconsistency

I've noticed an inconsistency in the way GET$# works in ARM BASIC V and BBC BASIC (Z80) v5.00 - the only two versions of BBC BASIC which support GET$# whilst also imposing a string length limit of 255 bytes, I think.

The difference can be illustrated by writing two 255-byte strings using BPUT# and reading them back using GET$#, for example:
   10 F%=OPENOUT"TEMP"
   20 BPUT#F%,STRING$(255,"A")
   30 BPUT#F%,STRING$(255,"B")
   40 CLOSE #F%
   50 F%=OPENIN"TEMP"
   60 A$=GET$#F%
   70 PRINT LEN A$
   80 B$=GET$#F%
   90 PRINT LEN B$
  100 CLOSE #F%
In BBC BASIC (Z80) this code outputs:
       255
       255
but in ARM BASIC V it outputs:
       255
       0
By contrast if you run this code:
   10 F%=OPENOUT"TEMP"
   20 BPUT#F%,STRING$(255,"A");
   30 BPUT#F%,STRING$(255,"B");
   40 CLOSE #F%
   50 F%=OPENIN"TEMP"
   60 A$=GET$#F%
   70 PRINT LEN A$
   80 B$=GET$#F%
   90 PRINT LEN B$
  100 CLOSE #F%
BBC BASIC (Z80) outputs:
       0
       254
whilst ARM BASIC V outputs:
       255
       255
I'm not going to pass judgement on whether one is 'right' and the other is 'wrong', but it's noteworthy - and arguably unfortunate - that there is a difference.

I would add that it's possible to make BBC BASIC (Z80) return the same result as ARM BASIC V in the latter case by making the following modification:
   10 F%=OPENOUT"TEMP"
   20 BPUT#F%,STRING$(255,"A");
   30 BPUT#F%,STRING$(255,"B");
   40 CLOSE #F%
   50 F%=OPENIN"TEMP"
   60 A$=GET$#F% BY 255
   70 PRINT LEN A$
   80 B$=GET$#F%
   90 PRINT LEN B$
  100 CLOSE #F%

Comments

  • In the ARM BASIC versions, GET$#chan% stops after 255 characters. Thus the file pointer is left pointing at the &0A character terminating the string. The second GET$# reads the &0A and returns an empty string.

    Line 1375 of this source file https://gitlab.riscosopen.org/RiscOS/Sources/Programmer/BASIC/-/blob/master/s/Factor confirms this hard stop at 255 characters.
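
    As a rough check of the above (a sketch only, reusing the TEMP file and variable names from the first example), printing PTR# straight after the first GET$# should show where each interpreter leaves the file pointer. If the behaviour is as described, ARM BASIC V would be expected to report 255 (still sitting on the &0A) and BBC BASIC (Z80) 256 (past it):
       10 F%=OPENOUT"TEMP"
       20 BPUT#F%,STRING$(255,"A")
       30 BPUT#F%,STRING$(255,"B")
       40 CLOSE #F%
       50 F%=OPENIN"TEMP"
       60 A$=GET$#F%
       70 REM Show the string length and where the file pointer now is
       80 PRINT LEN A$, PTR#F%
       90 CLOSE #F%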
  • Soruk wrote: »
    In the ARM BASIC versions, GET$#chan% stops after 255 characters.
    Yes, but this results in highly non-intuitive, and probably unexpected, behaviour, which is why BBC BASIC (Z80) doesn't work that way.

    Suppose you have a set of arbitrary strings - perhaps they are read from an external data source, or entered by the user, or something. You would expect to be able to write them to a data file with BPUT# and read them back from that file at a later date with GET$#, and get back what you started with.

    But if any of those strings is 255 bytes long that's not what will happen: although that string will be read correctly all the subsequent strings will be corrupted. What's even worse, this is likely to happen 'silently' and you'll end up with scrambled data for no obvious reason.

    How would you code this in ARM BASIC V so that any arbitrary string can be stored (in a 'compatible', LF-terminated, file) and then read back successfully?
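
    The silent corruption described above can be illustrated with a sketch like the following (the strings "ONE" and "TWO" are arbitrary test data, not from any real program). Given the ARM BASIC V behaviour reported earlier in the thread, the three reads there would be expected to return the 255-byte string, then an empty string, then "ONE", so everything after the long string is shifted by one and "TWO" is never reached:
       10 F%=OPENOUT"TEMP"
       20 BPUT#F%,STRING$(255,"A")
       30 BPUT#F%,"ONE"
       40 BPUT#F%,"TWO"
       50 CLOSE #F%
       60 F%=OPENIN"TEMP"
       70 FOR I%=1 TO 3
       80 A$=GET$#F%
       90 REM Print the length and the first few characters of each string
      100 PRINT LEN A$;" ";LEFT$(A$,3)
      110 NEXT
      120 CLOSE #F%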
  • Possibly something like
    IF BGET#chan% <> &0A THEN PTR#chan% = PTR#chan%-1
    
    immediately after the read. And, it should hopefully be cross-platform compatible.
  • Hated_moron
    edited December 2024
    Soruk wrote: »
    Possibly something like
    IF BGET#chan% <> &0A THEN PTR#chan% = PTR#chan%-1
    
    immediately after the read. And, it should hopefully be cross-platform compatible.
    Unless I've misunderstood, which is entirely possible, that simply transfers the issue from failing if any of the strings is 255 bytes long, to failing if any of the strings is zero bytes long. Am I wrong?
  • Soruk
    edited December 2024
    Yes, good spot. That check should only be run if the string just read is 255 characters in length. Therefore, on Acorn kit the BGET will see the LF, with the side-effect of moving the pointer on, and Z80 V5 won't return the LF, so the pointer will be rewound by one byte. On any shorter string, the LF will be read as the string terminator on both platforms, leaving the file pointer in the right place automatically.
  • Soruk wrote: »
    on Acorn kit the BGET will see the LF, with the side-effect of moving the pointer on, and Z80 V5 won't return the LF, so the pointer will be rewound by one byte...
    But if the next string in the file really is an empty string, in ARM BASIC the BGET will read the LF that wasn't read by the GET$#, but in BBC BASIC (Z80) it will read the LF that terminates the following empty string!

    If your objective is to write compatible code, you'd have to note the file pointer before reading the string. Then if the string length is 255 bytes you would set the file pointer 256 bytes beyond that. This would be a no-op on the Z80 but would advance the file pointer past the LF on the ARM:
    P% = PTR#F% : A$ = GET$#F% : IF LEN(A$)=255 THEN PTR#F% = P% + 256
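
    Wrapped up as a function, the same idea might look something like this. It is only a sketch: FNget is a made-up name, and it assumes every string in the file is at most 255 bytes long and LF-terminated (i.e. written with BPUT# and no trailing semicolon). The test data deliberately puts an empty string straight after the 255-byte one, which is the case that defeated the BGET# approach; if the reasoning above holds, the lengths printed should be 255, 0 and 5 on both platforms:
       10 F%=OPENOUT"TEMP"
       20 BPUT#F%,STRING$(255,"A")
       30 BPUT#F%,""
       40 BPUT#F%,"HELLO"
       50 CLOSE #F%
       60 F%=OPENIN"TEMP"
       70 FOR I%=1 TO 3
       80 A$=FNget(F%)
       90 PRINT LEN A$
      100 NEXT
      110 CLOSE #F%
      120 END
      130 REM Read one LF-terminated string from channel C%
      140 DEF FNget(C%)
      150 LOCAL P%,S$
      160 REM Note the pointer first, then read the string
      170 P%=PTR#C% : S$=GET$#C%
      180 REM For a 255-byte string, step past the LF explicitly
      190 IF LEN S$=255 THEN PTR#C%=P%+256
      200 =S$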