
# AutoTranscript GUI 🎙️

AutoTranscript is a powerful, GPU-accelerated subtitle generator built on top of OpenAI's Whisper model. It features both a command-line interface (CLI) and a beautiful CustomTkinter-based GUI for users who prefer a graphical workflow.

Supports:

- Languages including English, Chinese, Japanese, and Korean
- Local audio/video files
- Transcribing or translating YouTube videos from just a link
- Subtitle translation to English
- OpenAI API for higher-quality translations (currently NOT AVAILABLE)

## ✨ Features

- 🖥️ Full-featured GUI with progress tracking and real-time logs
- 📜 Generates .srt subtitle files from media files
- 🌍 Supports multilingual transcription and optional translation to English
- 🧠 Uses Faster-Whisper for fast, GPU-accelerated transcription

**YouTube tutorial (in Spanish):** https://www.youtube.com/watch?v=dB6D1i1BjXc


## 📸 GUI Preview

(screenshot of the main application window)


## 🧩 Requirements

- Python
- NVIDIA GPU with CUDA (recommended)
- Visual C++ Redistributable 14

## Installation for Releases

1. Extract the .rar archive.
2. Open the extracted app folder.
3. Click the File Explorer address bar at the top, type `cmd`, and press Enter.
4. In the console, run `pip install -r requirements.txt`.
5. Go back to the folder and run the .bat file.

## 📦 Installation

```shell
git clone https://github.com/jjaruna/autoTranscriptGUI.git
cd autoTranscriptGUI
pip install -r requirements.txt
```

## 🚀 Launch the GUI

```shell
python AutoTranscriptGUI.py
```

πŸ” Whisper Model Comparison Summary

Model VRAM (Min) βš™οΈ Performance 🎯 Use Case 🌐 Translate into English
tiny β‰₯ 1 GB ⚑ Very Fast Quick tests, low-resource devices βœ…
base β‰₯ 2 GB ⚑ Fast Simple transcriptions, short audio βœ…
small β‰₯ 4 GB βš–οΈ Balanced Decent accuracy and speed for general use βœ…
medium β‰₯ 8 GB πŸ•’ Slower High-quality results for longer files βœ…
large-v1 β‰₯ 10 GB 🐒 Slower Older but still strong performer βœ…
large-v2 β‰₯ 10 GB 🐒 Slower More robust, especially with noisy inputs βœ…
large-v3 β‰₯ 12 GB 🐌 Slowest Highest accuracy offline, latest version βœ…
large-v3-turbo β‰₯ 8–10 GB ⚑ Fastest High-speed, high-accuracy, great multilingual support ❌
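As a rough illustration of how the VRAM minimums in the table above could drive a model choice, here is a small helper. This is a sketch, not part of the app; the `suggest_model` function, its tie-breaking, and its fallback to `tiny` are assumptions.

```python
# Minimum VRAM in GB, taken from the comparison table above
# (large-v3-turbo uses the lower bound of its 8-10 GB range).
MODEL_MIN_VRAM_GB = {
    "tiny": 1,
    "base": 2,
    "small": 4,
    "medium": 8,
    "large-v1": 10,
    "large-v2": 10,
    "large-v3": 12,
    "large-v3-turbo": 8,
}

def suggest_model(vram_gb: float, need_translation: bool = False) -> str:
    """Return the most demanding model whose VRAM minimum fits,
    optionally restricted to models that can translate into English."""
    candidates = [
        (req, name)
        for name, req in MODEL_MIN_VRAM_GB.items()
        if req <= vram_gb
        # Per the table, large-v3-turbo cannot translate into English.
        and not (need_translation and name == "large-v3-turbo")
    ]
    if not candidates:
        return "tiny"  # fall back to the smallest model
    # Highest VRAM requirement that still fits = most capable model.
    return max(candidates)[1]
```

For example, a 4 GB GPU would get `small` suggested, while a 12 GB GPU would get `large-v3`.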

## 🧠 Recommendation

After testing the large-v3-turbo model more than ten times, I can confidently say it is the fastest and most accurate of all the Whisper models included in this app.

🖥️ My system has 4 GB of VRAM, and even though that is below the recommended VRAM for large models, large-v3-turbo still performed exceptionally well.

⚠️ Note: Your experience may vary depending on your GPU and available VRAM. Treat this recommendation as a reference, not a guarantee. If you run into performance issues, try a smaller model such as medium or small.


## 🖥️ CLI Mode (Optional)

You can still use the command-line version via autosub.py:

```shell
python autosub.py myvideo.mp4 -l ja --translate --model base
```

### CLI Options

| Option | Description |
|--------|-------------|
| `filename` | Path to the input media file |
| `-l`, `--language` | Force language (e.g. `en`, `es`, `zh`) |
| `-t`, `--translate` | Translate to English |
| `-o`, `--openai` | Use the OpenAI API |
| `--model` | Whisper model to use |
| `--debug` | Enable debug mode |
| `--keep` | Keep the intermediate WAV file |
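The option set above could be declared with Python's argparse roughly as follows. This mirrors the documented flags but is an assumption about the actual implementation in autosub.py; in particular, the default model name here is illustrative.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the documented CLI options."""
    parser = argparse.ArgumentParser(prog="autosub.py")
    parser.add_argument("filename", help="path to the input media file")
    parser.add_argument("-l", "--language",
                        help="force language (e.g. en, es, zh)")
    parser.add_argument("-t", "--translate", action="store_true",
                        help="translate subtitles to English")
    parser.add_argument("-o", "--openai", action="store_true",
                        help="use the OpenAI API")
    parser.add_argument("--model", default="base",  # default is assumed
                        help="Whisper model to use")
    parser.add_argument("--debug", action="store_true",
                        help="enable debug mode")
    parser.add_argument("--keep", action="store_true",
                        help="keep the intermediate WAV file")
    return parser

# Parsing the example command shown earlier:
args = build_parser().parse_args(
    ["myvideo.mp4", "-l", "ja", "--translate", "--model", "base"]
)
```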

πŸ“ Output

  • Subtitles are saved as .srt files in the same folder as your media.
  • If translated, original and translated text will be preserved.
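For reference, here is a minimal sketch of how transcription segments could be rendered into the .srt format: each cue is an index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text. The `srt_timestamp`/`to_srt` helpers and the tuple-based segment data are hypothetical, not the app's internals.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render (start, end, text) tuples as the body of an .srt file."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)

example = to_srt([(0.0, 2.5, "Hello, world."), (2.5, 5.0, "Second line.")])
```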

## 🧪 Example GUI Workflow

1. Open the GUI
2. Select a video/audio file
3. Choose the language and Whisper model
4. (Optional) Enable "Translate to English"
5. Click Start Transcription

πŸ™ Credits


πŸ“„ License

MIT License β€” free for personal and commercial use.
