How to Read a .wav file or other audio file

Hi, I'm not sure this is the right place to ask, but I am trying to find a way of reading the audio signal from a .wav file in LPCM format into an array or data file, rather than playing it, so that I can present the waveform visually, analyse the signal for voice recognition, or modify the sound. If anyone can help or point me in the right direction I'd be grateful.
Thanks, Richard W.

Comments

  • Dickyw42 wrote: »
    I am trying to find a way of reading audio signal from a .wav file in lpcm format into an array or data file

    There are several ways this could be approached, some using only native BBC BASIC code and others relying on functions in (for example) the SDL2 library.

    The native-code approach relies on knowing that a WAV file has a header with a well-defined format. So I would start by defining a structure corresponding to that header (caution: this code isn't compatible with lowercase keywords):
          DIM wav{riff%, total%, wave%, fmt%, fmtsize%, wFormatTag{l&,h&}, nChannels{l&,h&}, \
          \       nSamplesPerSec%, nAvgBytesPerSec%, nBlockAlign{l&,h&}, wBitsPerSample{l&,h&}, \
          \       data%, datasize%}
    

    Having defined the header format, you can read a corresponding amount from the start of the file, most conveniently into a string:
          wav% = OPENIN(wavfile$)
          IF wav% = 0 ....  REM handle the case when the file couldn't be opened
          header$ = GET$#wav% BY DIM(wav{})
    

    Then point the structure to what has been read from the file:
          PTR(wav{}) = PTR(header$)
    

    Now you can conveniently read the parameters of the file directly from the structure, for example:
          PRINT "Sampling rate = "; wav.nSamplesPerSec%; " samples per second"
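
    The 16-bit fields in the structure are declared as pairs of bytes (l& and h&), so they need combining before use. A minimal sketch, assuming the file has the plain 44-byte header that the structure describes; the names format%, channels% and bits% are only illustrative, and the same caution about lowercase keywords applies:

    ```bbcbasic
    REM Combine the low and high bytes of each 16-bit field
    format% = wav.wFormatTag.l& + 256 * wav.wFormatTag.h&
    channels% = wav.nChannels.l& + 256 * wav.nChannels.h&
    bits% = wav.wBitsPerSample.l& + 256 * wav.wBitsPerSample.h&
    PRINT "Channels = "; channels%
    PRINT "Bits per sample = "; bits%

    REM Sanity checks: the tags are ASCII text read as little-endian integers
    IF wav.riff% <> &46464952 THEN PRINT "No 'RIFF' tag: probably not a WAV file"
    IF wav.data% <> &61746164 THEN PRINT "No 'data' tag here: the header has extra chunks"
    REM A format tag of 1 means uncompressed linear PCM
    IF format% <> 1 THEN PRINT "Format tag is "; format%; ", not plain LPCM"
    ```

    If the 'data' check fails, the fixed structure does not line up with the file and the audio data will not start immediately after the 44 bytes already read.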
    

    At this point you can handle the actual audio data any way you want, either by loading it all into memory if it will fit (remember that you can raise HIMEM) or by loading only a 'chunk' at a time if it won't.
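
    As a rough sketch of the 'chunk at a time' option, assuming 16-bit stereo LPCM (little-endian, two's complement, left sample first) and that the file pointer is still just past the header. FNsample16, left%(), right%() and the block size of 1000 frames are illustrative choices, not part of the code above:

    ```bbcbasic
    REM Read up to nframes% stereo frames into two integer arrays
    nframes% = 1000
    DIM left%(nframes% - 1), right%(nframes% - 1)
    n% = 0
    WHILE n% < nframes% AND NOT EOF#wav%
      left%(n%) = FNsample16(wav%)
      right%(n%) = FNsample16(wav%)
      n% = n% + 1
    ENDWHILE
    PRINT "Frames read = "; n%
    REM ... process this block here, then repeat the loop for the next one
    END

    REM Read one signed 16-bit sample: low byte first, then high byte
    DEF FNsample16(chan%)
    LOCAL lo%, hi%, v%
    lo% = BGET#chan%
    hi% = BGET#chan%
    v% = lo% + 256 * hi%
    REM Values of 32768 and above represent negative samples
    IF v% >= 32768 THEN v% = v% - 65536
    = v%
    ```

    Keeping the samples in integer arrays rather than strings makes them much cheaper to store and easier to plot or analyse afterwards.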

    Hopefully this will have given you some ideas, at least.
  • Great, thanks for that. I'll experiment with it. Thanks again.
  • Hi Richard,
    Thanks again for your suggestions. I spent some time trying to make sense of the raw .wav data in a recording I made using the recorder in the bbcsdl20 examples folder. After reading up a bit more on PCM formatting, I think my file (Aliens.wav) has two interleaved audio channels of 2 bytes per sample, with the most significant bit as the sign bit. I graphed the output, as shown in the screenshots below.

    [Screenshots: 5bx2fdr5b3iw.jpg, be7l9dvxxurd.jpg]

    I used this code to load the header, as you suggested:

    wavfile$="fours/Aliens1.wav"
    DIM wav{riff%, total%, wave%, fmt%, fmtsize%, wFormatTag{l&,h&}, nChannels{l&,h&}, \
    \       nSamplesPerSec%, nAvgBytesPerSec%, nBlockAlign{l&,h&}, wBitsPerSample{l&,h&}, \
    \       data%, datasize%}
    wav% = OPENIN(wavfile$)
    IF wav% = 0 THEN PRINT "Could not open file: ";wavfile$ : STOP
    header$ = GET$#wav% BY DIM(wav{})
    PTR(wav{}) = PTR(header$)

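    I used the code below to load the data from the file after reading the header:

    REM PULSE$() AND PVQ$() ARE ASSUMED TO HAVE BEEN DIMENSIONED EARLIER
    LR=0
    PLS=0
    NEG=0
    REPEAT
      PLS=PLS+1
      REM COLUMN 0 HOLDS THE RAW 2 BYTES; THE DECODED VALUES ALTERNATE
      REM BETWEEN COLUMN 2 (ODD SAMPLES) AND COLUMN 1 (EVEN SAMPLES)
      PULSE$(PLS,0)=GET$#wav% BY 2
      LR=1+PLS MOD 2
      PULSE$(PLS,LR)=FNSTR2VAL(LEFT$(PULSE$(PLS,0),2))
    UNTIL PLS>999999 OR EOF#wav%
    CLOSE#wav%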
    .......
    ........
    STOP

    DEF FNSTR2VAL(AB$)
    LXV=0
    CODSUM=0
    SSGN=1
    FOR ZXZ=1 TO 130:PVQ$(ZXZ)="":NEXT

    REM WAV SAMPLES ARE LITTLE-ENDIAN, SO THE LEFTMOST BYTE IS THE LEAST SIGNIFICANT.
    REM NUMBERING THE LEFTMOST BYTE AS ZERO THEN WORKING RIGHT, BYTENUM = 0,1,2,3 ETC.,
    REM MULTIPLY THE ASCII CODE OF EACH BYTE BY 256^(BYTENUM)

    FOR LXX=1 TO LEN(AB$)
      PVQ$(LXX)=STR$(ASC(MID$(AB$,LXX,1)))
      CODSUM=ASC(MID$(AB$,LXX,1))*(256^(LXX-1))+CODSUM
    NEXT LXX

    REM CONVERTING THE RAW UNSIGNED VALUE INTO AN ACTUAL SIGNED DECIMAL VALUE.
    REM THE SAMPLES ARE TWO'S COMPLEMENT, RUNNING FROM 1000000000000000 (-32768)
    REM TO 0111111111111111 (+32767), SO WHEN THE MSB (SIGN BIT) IS SET
    REM SUBTRACT 256^(NUMBER OF BYTES), I.E. 65536 FOR 2 BYTES

    IF CODSUM>=(256^LEN(AB$))/2 THEN
      SSGN=-1
      NEG=NEG+1
      CODSUM=CODSUM-256^LEN(AB$)
    ENDIF

    PVQ$(0)=STR$(CODSUM)
    =PVQ$(0)


    I hope this is OK?
    My objective is to write code that can identify numbers from short sections of speech. This could well be too ambitious, but it's worth a try anyway. If you have any suggestions, they would be very useful. I have learned so much so far.

    Thanks.


