feat(api): Add transcribe response format request parameter & adjust STT backends #8318
base: master
Conversation
(cherry picked from commit e271dd7) Signed-off-by: Andres Smith <[email protected]>
(cherry picked from commit 6a93a8f) Signed-off-by: Andres Smith <[email protected]>
(cherry picked from commit f25d1a0) Signed-off-by: Andres Smith <[email protected]>
✅ Deploy Preview for localai ready!
…lso work on CLI Signed-off-by: Andres Smith <[email protected]> (cherry picked from commit 69a9397) Signed-off-by: Andres Smith <[email protected]>
Force-pushed from 77bf6a3 to f0e5b46.
I'm totally fine with doing it in a separate PR; for now it's looking good!
That's nice, thank you for switching to the official client (we should probably do the same across the codebase).
I don't see these changes in the PR - is this intentional? In any case, it looks good here - thanks!
Description
Closes #1071.
This PR adds support for the `response_format` request parameter in the transcription endpoint of the API and in the `transcribe` CLI command, in accordance with the official OpenAI API (with the addition of the `lrc` format, which I am particularly interested in :) ). The responses of the transcription endpoint now mirror the behaviour of the official API, with one exception: when the parameter is omitted, the endpoint behaves as it did previously, so as not to break existing use cases.
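For reference, here is a minimal sketch of exercising the new parameter over plain HTTP. The base URL, model name, and audio file are assumptions; the endpoint and form fields follow the OpenAI transcription API:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

func main() {
	// Build a multipart form with the audio file, the model, and the
	// response_format field (here the lrc format added by this PR).
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)

	f, err := os.Open("audio.wav") // assumption: any supported audio file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	part, _ := w.CreateFormFile("file", "audio.wav")
	io.Copy(part, f)
	w.WriteField("model", "whisper-1") // assumption: model name as configured locally
	w.WriteField("response_format", "lrc")
	w.Close()

	// assumption: a LocalAI instance listening on the default port
	resp, err := http.Post("http://localhost:8080/v1/audio/transcriptions",
		w.FormDataContentType(), &buf)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // e.g. "[00:01.00] first line" for lrc
}
```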
The start/end values for each segment have also been adjusted in the `whisper` and `faster-whisper` backends, since they were returning values that yielded incorrect results when converted to the `time.Duration` field in the main application.
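As an illustration of the kind of unit mismatch being fixed (a minimal sketch with a hypothetical value, not the actual backend code): `time.Duration` is an integer nanosecond count, so a seconds value cast directly into it is silently misread as nanoseconds and must be scaled instead:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical segment start reported by a backend, in seconds.
	seconds := 3.5

	// A direct cast treats the value as nanoseconds.
	wrong := time.Duration(seconds) // 3ns

	// Scaling by time.Second performs the intended conversion.
	right := time.Duration(seconds * float64(time.Second)) // 3.5s

	fmt.Println(wrong, right) // prints: 3ns 3.5s
}
```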
In the `faster-whisper` backend, `compute_type` was changed from `float16` to `default`. I was mistakenly running the model on CPU and it was failing because my CPU does not properly support float16; once the model was configured to use `cuda` it worked fine, but this means `faster-whisper` with `float16` currently won't work on some CPUs. `default` works on all devices; we can add a check for the `f16` config option if we want to re-enable float16 support in this backend, either in this PR or in a separate one.

The tests were updated to use the official OpenAI Go client, since the client used before did not support the `response_format` request parameter properly.
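For context, a rough sketch of what a transcription call with the official Go client can look like; the field and constant names (`AudioTranscriptionNewParams`, `ResponseFormat`, `AudioResponseFormatVerboseJSON`) reflect my reading of recent `github.com/openai/openai-go` releases and may differ in other versions, so treat them as assumptions rather than the exact test code:

```go
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/openai/openai-go"
	"github.com/openai/openai-go/option"
)

func main() {
	// assumption: a LocalAI instance on the default port; the API key is unused
	client := openai.NewClient(
		option.WithBaseURL("http://localhost:8080/v1"),
		option.WithAPIKey("sk-unused"),
	)

	f, err := os.Open("audio.wav") // assumption: any supported audio file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	res, err := client.Audio.Transcriptions.New(context.Background(),
		openai.AudioTranscriptionNewParams{
			Model:          openai.AudioModelWhisper1,
			File:           f,
			ResponseFormat: openai.AudioResponseFormatVerboseJSON,
		})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Text)
}
```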
I also took the liberty of restricting certain CI workflows to run only on the main repo, since our forks will not have the credentials to run them correctly, nor should they. If you'd rather I remove these changes, I'll undo them; it's just a convenience to avoid the email spam from the pipelines that constantly fail on forks.
Notes for Reviewers
These changes were tested against the `whisper` and `faster-whisper` backends; I was unable to test with `qwen-asr`. The AIO tests were also run successfully.
Signed commits