Enable using speech processor in Docker by using HTTP #11
base: main
Conversation
This depends on #12
sarapapi left a comment:
I just have a few questions/clarification comments, but in general it looks good to me.
```
docker run --rm --gpus=all -p 8080:8080 http_speech_processor
```
And then, you can use `simulstream` setting the proxy HTTP processor to access your
Suggested change:
"And then, you can use `simulstream` setting the proxy HTTP processor to access your"
→ "And then, you can use `simulstream` by setting the proxy HTTP processor to access your"
Here, `simulstream` is the command, not the tool.
@@ -0,0 +1,32 @@
# Example of Docker Speech Processor

This folder contains a Dockerfile that is a working example of how to build a Docker
Suggested change:
"This folder contains a Dockerfile that is a working example of how to build a Docker"
→ "This folder contains a [Dockerfile](examples/http_docker/Dockerfile) that is a working example of how to build a Docker"
```
--metrics-log-file $YOUR_OUTPUT_JSONL_FILE
```
Please notice that this example Dockerfile runs a Canary sliding window speech processor.
Suggested change:
"Please notice that this example Dockerfile runs a Canary sliding window speech processor."
→ "Please notice that [this Dockerfile example](examples/http_docker/Dockerfile) runs a Canary sliding window speech processor."
        self.close_session(session_id)

    def shutdown(self) -> None:
        """
Either add comments like this to all methods, or drop this one, which is pretty redundant.
    def get_speech_chunk_size(self, session_id):
        processor = self.speech_processor_manager.get(session_id)
        self._send_json_response(200, {"speech_chunk_size": processor.speech_chunk_size})
What is this 200?
These are standard HTTP status codes: 204 means No Content, and 200 just means "everything is good" (OK). This is the standard HTTP protocol.
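For reference, these codes come straight from the HTTP specification, and Python's standard library names them, so the handlers could use symbolic names instead of bare integers:

```python
from http import HTTPStatus

# 200 OK: the request succeeded and the response carries a body
# (here, the JSON payload containing speech_chunk_size).
assert HTTPStatus.OK == 200

# 204 No Content: the request succeeded but there is nothing to return,
# which fits setter endpoints like put_source_language.
assert HTTPStatus.NO_CONTENT == 204

print(HTTPStatus.OK.phrase, "/", HTTPStatus.NO_CONTENT.phrase)
```

Using `HTTPStatus.OK` and `HTTPStatus.NO_CONTENT` in the code would make the intent self-documenting without any extra comments.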
        processor = self.speech_processor_manager.get(session_id)
        output = processor.process_chunk(
            np.frombuffer(base64.b64decode(waveform), dtype=np.float32))
        self._send_json_response(200, {
Same here
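As a side note on the decoding line above: the client presumably base64-encodes the raw float32 bytes before sending them over HTTP. A minimal round-trip sketch (assuming NumPy is available and the wire format is a base64-encoded float32 buffer; the variable names are illustrative):

```python
import base64

import numpy as np

# Client side: serialize a float32 waveform into a JSON-safe base64 string.
waveform = np.array([0.0, 0.5, -0.5, 1.0], dtype=np.float32)
encoded = base64.b64encode(waveform.tobytes()).decode("ascii")

# Server side: recover the original samples, mirroring the
# np.frombuffer(base64.b64decode(...)) call in process_chunk above.
decoded = np.frombuffer(base64.b64decode(encoded), dtype=np.float32)

assert np.array_equal(waveform, decoded)
```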
    def put_source_language(self, session_id, language):
        processor = self.speech_processor_manager.get(session_id)
        processor.set_source_language(language)
        self._send_json_response(204)
Add a comment explaining the meaning of the response code.
        yaml_config(args.speech_processor_config), server_config.pool_size, server_config.ttl
    )
    speech_processor_loading_time = time.time() - speech_processor_loading_time
    LOGGER.info(f"Loaded speech processor in {speech_processor_loading_time:.3f} seconds")
This also includes the time to load the model, right?
yes
Co-authored-by: sarapapi <57095209+sarapapi@users.noreply.github.com>
Why is this needed?
For sharing and reusing speech processors, it is convenient to be able to run them in Docker, so that the full environment is available. This is important, e.g., for IWSLT campaigns, where organizers need to run participants' solutions.
What does the PR do?
It creates a new HTTP-based speech processor and an HTTP server that exposes a configured speech processor. In this way, the simulstream server/inference can be run by setting the HTTP-based proxy speech processor and configuring it to communicate with the HTTP server, which can run in a Docker container.
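The shape of this client/server split can be sketched with the standard library alone. Note that the endpoint paths, the `DummyProcessor` class, and the payload below are illustrative assumptions, not the PR's actual API; this only shows the pattern of a proxy issuing HTTP requests to a containerized processor:

```python
import json
import threading
from http import HTTPStatus
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class DummyProcessor:
    """Hypothetical stand-in for the real speech processor."""
    speech_chunk_size = 4000


class Handler(BaseHTTPRequestHandler):
    processor = DummyProcessor()

    def do_GET(self):
        # Mirrors a getter endpoint: 200 OK with a JSON body.
        body = json.dumps(
            {"speech_chunk_size": self.processor.speech_chunk_size}
        ).encode()
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_PUT(self):
        # Mirrors a setter endpoint: 204 No Content, empty body.
        self.send_response(HTTPStatus.NO_CONTENT)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


# The "Docker side": serve on an ephemeral port in a background thread.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "proxy speech processor side": plain HTTP requests to the container.
conn = HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/speech_chunk_size")
resp = conn.getresponse()
print(resp.status, json.loads(resp.read()))  # 200 {'speech_chunk_size': 4000}

conn.request("PUT", "/source_language")
resp = conn.getresponse()
resp.read()
print(resp.status)  # 204

server.shutdown()
```

Because the transport is plain HTTP, the same proxy works whether the processor runs natively, in Docker, or on a remote machine; only the host/port configuration changes.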
How is this documented?
Updated the documentation and added an example of how to build a Docker image to the repo.
How was the PR tested?
Manual runs.