Taejin committed on
Commit 22e8536 · 1 Parent(s): 6b5086d

Edited README with the code snippets

Signed-off-by: taejinp <[email protected]>

Files changed (1)

  1. README.md +14 -15

README.md CHANGED

````diff
@@ -264,11 +264,8 @@ The model is available for use in the NeMo Framework[7], and can be used as a pr
 from nemo.collections.asr.models import SortformerEncLabelModel, ASRModel
 import torch
 # A speaker diarization model is needed for tracking the speech activity of each speaker.
-diar_model = SortformerEncLabelModel.from_pretrained("nvidia/diar_streaming_sortformer_4spk-v2.1")
-diar_model.eval().to(torch.device("cuda"))
-
-asr_model = ASRModel.from_pretrained("nvidia/multitalker-parakeet-streaming-0.6b-v1.nemo")
-asr_model.eval().to(torch.device("cuda"))
+diar_model = SortformerEncLabelModel.from_pretrained("nvidia/diar_streaming_sortformer_4spk-v2.1").eval().to(torch.device("cuda"))
+asr_model = ASRModel.from_pretrained("nvidia/multitalker-parakeet-streaming-0.6b-v1.nemo").eval().to(torch.device("cuda"))
 
 # Use the pre-defined dataclass template `MultitalkerTranscriptionConfig` from `multitalker_transcript_config.py`.
 # Configure the diarization model using streaming parameters:
@@ -314,26 +311,28 @@ for step_num, (chunk_audio, chunk_lengths) in enumerate(streaming_buffer_iter):
 )
 
 # Generate the speaker-tagged transcript and print it.
-seglst_dict_list = multispk_asr_streamer.generate_seglst_dicts_from_parallel_streaming(samples=samples)
-print(seglst_dict_list)
+multispk_asr_streamer.generate_seglst_dicts_from_parallel_streaming(samples=samples)
+print(multispk_asr_streamer.instance_manager.seglst_dict_list)
 ```
 
 ### Method 2. Use NeMo example file in NVIDIA/NeMo
 
-Use [an multitalker streaming ASR example script file](https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py) in [NVIDIA NeMo Framework](https://github.com/NVIDIA-NeMo/NeMo) to launch.
-```python
+Use [the multitalker streaming ASR example script file](https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py) in [NVIDIA NeMo Framework](https://github.com/NVIDIA-NeMo/NeMo) to launch. With this method, download the `.nemo` model files and specify them in the command:
+```bash
 python ${NEMO_ROOT}/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py \
-asr_model=nvidia/multitalker-parakeet-streaming-0.6b-v1 \ # Multitalker ASR model
-diar_model=nvidia/diar_streaming_sortformer_4spk-v2 \ # Diarization model
-audio_file="/path/to/example.wav" \ # Your audio file for transcription
-output_path="/path/to/example_output.json" \ # where to save the output seglst file
+asr_model="/path/to/your/multitalker-parakeet-streaming-0.6b-v1.nemo" \
+diar_model="/path/to/your/nvidia/diar_streaming_sortformer_4spk-v2.nemo" \
+att_context_size="[70,13]" \
+generate_realtime_scripts=False \
+audio_file="/path/to/example.wav" \
+output_path="/path/to/example_output.json"
 ```
 
 Or the `audio_file` argument can be replaced with the `manifest_file` to handle multiple files in batch mode:
-```python
+```bash
 python ${NEMO_ROOT}/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py \
 ... \
-manifest_file=example.json \ # NeMo style manifest file
+manifest_file="example.json" \
 ... \
 ```
 
````
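
For readers unfamiliar with the two file formats that appear in this diff, below is a minimal, NeMo-free sketch of (1) a NeMo-style manifest for `manifest_file` batch mode and (2) a SegLST-style `seglst_dict_list` like the one the README snippet prints. The field names used here (`audio_filepath`, `offset`, `duration`, `session_id`, `speaker`, `start_time`, `end_time`, `words`) follow common NeMo and SegLST conventions but are illustrative assumptions, not a schema verified against this script.

```python
# Illustrative sketch only: the shapes of a NeMo-style manifest and a
# SegLST-style segment list. Field names are assumed conventions, not
# the verified schema of speech_to_text_multitalker_streaming_infer.py.
import json
import tempfile

# (1) NeMo-style manifest: one JSON record per line (JSONL).
records = [
    {"audio_filepath": "/path/to/example1.wav", "offset": 0.0, "duration": 30.0},
    {"audio_filepath": "/path/to/example2.wav", "offset": 0.0, "duration": 12.5},
]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
    manifest_path = f.name

# Manifests are consumed line by line, one JSON object per line.
with open(manifest_path) as f:
    loaded = [json.loads(line) for line in f]
assert loaded == records

# (2) SegLST-style output: a list of per-segment dicts with speaker tags,
# mimicking what `seglst_dict_list` in the snippet above might hold.
seglst_dict_list = [
    {"session_id": "example", "speaker": "speaker_0",
     "start_time": 0.48, "end_time": 2.10, "words": "hello how are you"},
    {"session_id": "example", "speaker": "speaker_1",
     "start_time": 1.95, "end_time": 3.40, "words": "i am doing fine"},
]

# Render a simple speaker-tagged transcript, ordered by segment start time.
for seg in sorted(seglst_dict_list, key=lambda s: s["start_time"]):
    print(f'[{seg["start_time"]:.2f}-{seg["end_time"]:.2f}] {seg["speaker"]}: {seg["words"]}')
```

In the real pipeline the segment list comes from `multispk_asr_streamer`; this sketch only mimics its shape so that the printed speaker-tagged transcript format is concrete.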