Speech-to-Text Web Integration
Summary
Freelancer Client is hiring: Speech-to-Text Web Integration.
Location: Remote
I’m upgrading my site so visitors can upload either legal or general recordings—audio or video—and receive an automatic transcript that is courtroom-ready or publication-ready, depending on the option they choose.
Skills: PHP, JavaScript, Transcription, HTML, Web Development, Payment Gateway Integration, Frontend Development, API Development
Budget: $15–$25 USD
Source: Freelancer Client via Remote / Online. Apply on the source website.
Original
I’m upgrading my site so visitors can upload either legal or general recordings—audio or video—and receive an automatic transcript that is courtroom-ready or publication-ready, depending on the option they choose.
Here is the workflow I need built and installed:
1. Speech-to-text engine
• Accepts both legal and general recordings in common formats (MP3, WAV, MP4, MOV, etc.).
• Delivers very high accuracy by leveraging a proven API or an on-prem model such as Google Speech-to-Text, AWS Transcribe, Whisper, Kaldi, or a comparable solution you recommend.
• Outputs two switchable templates:
– Legal: numbered lines, speaker identification, and time-stamped entries.
– General: speaker identification with clean paragraph formatting.
• Template choice is made by the end user before checkout.
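The two templates above could be rendered from the same diarized output. What follows is a minimal sketch, not a definitive implementation: the `Segment` fields, the function names, and the exact line layout are assumptions, and the supplied sample transcripts would dictate the real formatting. Here one numbered line is emitted per speaker turn, which simplifies the page-based line numbering a court format may require.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One speaker turn. Field names are illustrative, not tied to any engine's API."""
    speaker: str
    start: float  # seconds from the start of the recording
    text: str


def timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def format_legal(segments: List[Segment]) -> str:
    """Legal template: numbered lines, each with a timestamp and a speaker label."""
    lines = []
    for number, seg in enumerate(segments, start=1):
        lines.append(
            f"{number:>4}  [{timestamp(seg.start)}]  {seg.speaker.upper()}: {seg.text}"
        )
    return "\n".join(lines)


def format_general(segments: List[Segment]) -> str:
    """General template: consecutive turns by one speaker merge into a paragraph."""
    paragraphs: List[Tuple[str, List[str]]] = []
    for seg in segments:
        if paragraphs and paragraphs[-1][0] == seg.speaker:
            paragraphs[-1][1].append(seg.text)
        else:
            paragraphs.append((seg.speaker, [seg.text]))
    return "\n\n".join(f"{speaker}: {' '.join(texts)}" for speaker, texts in paragraphs)


def render(segments: List[Segment], template: str) -> str:
    """Dispatch on the template chosen by the end user ('legal' or 'general')."""
    formatters = {"legal": format_legal, "general": format_general}
    if template not in formatters:
        raise ValueError(f"unknown template: {template!r}")
    return formatters[template](segments)
```

Keeping the engine output in one neutral structure and formatting it afterwards means a job can be re-rendered in the other template without transcribing again.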
2. Front-end upload & order form
• Drag-and-drop or file-picker upload.
• Drop-down to select “Legal” or “General” plus any optional metadata fields you advise.
• Real-time price display.
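The price shown on the form should match the amount charged at checkout, so one option is to compute it on the server and have the form request it whenever the file or template changes. The sketch below assumes per-minute pricing with hypothetical rates; the listing does not specify either, and `quote` and `is_supported` are illustrative names.

```python
import math
import os
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical per-minute rates in USD; the listing does not state pricing.
RATES_PER_MINUTE = {"legal": Decimal("1.50"), "general": Decimal("1.00")}

# Formats named in the listing; "etc." there suggests the real list is longer.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".mov"}


def is_supported(filename: str) -> bool:
    """Check the file extension case-insensitively against the allowed set."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS


def quote(duration_seconds: float, template: str) -> Decimal:
    """Return the order price, billing each started minute in full."""
    if template not in RATES_PER_MINUTE:
        raise ValueError(f"unknown template: {template!r}")
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    minutes = math.ceil(duration_seconds / 60)
    total = RATES_PER_MINUTE[template] * minutes
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

`Decimal` is used instead of floats so the displayed and charged amounts cannot drift apart through rounding.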
3. Secure payment step
• Processes the order through a mainstream online gateway (Stripe is my first choice, but I’m open to PayPal or Authorize.Net if integration is faster).
• Confirms the transaction and triggers transcription automatically.
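If Stripe is chosen, one common way to have a confirmed payment trigger transcription is a webhook endpoint that verifies the `Stripe-Signature` header before queuing the job. The sketch below uses only the standard library and assumes the documented scheme (HMAC-SHA256 over `<timestamp>.<payload>`, sent as `t=...,v1=...`); in production the official Stripe library's own verification helper would be the safer choice. The `job_id` metadata key and the `enqueue` callback are hypothetical.

```python
import hashlib
import hmac
import json
import time
from typing import Callable, Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Validate a header of the form 't=<unix time>,v1=<hex hmac>[,v1=...]'."""
    pairs = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamps = [value.strip() for key, value in pairs if key.strip() == "t"]
    signatures = [value.strip() for key, value in pairs if key.strip() == "v1"]
    if not timestamps or not signatures or not timestamps[0].isdigit():
        return False
    current = time.time() if now is None else now
    # Reject old timestamps to limit replay of captured requests.
    if abs(current - int(timestamps[0])) > tolerance:
        return False
    signed = timestamps[0].encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in signatures
    )


def handle_webhook(payload: bytes, header: str, secret: str,
                   enqueue: Callable[[str], None],
                   now: Optional[float] = None) -> int:
    """Return an HTTP status code; queue transcription on a completed checkout."""
    if not verify_stripe_signature(payload, header, secret, now=now):
        return 400
    event = json.loads(payload)
    if event.get("type") == "checkout.session.completed":
        session = event["data"]["object"]
        # 'job_id' is a hypothetical metadata key set when the session is created.
        enqueue(session["metadata"]["job_id"])
    return 200
```

Because a webhook may be delivered more than once, the `enqueue` step would likely need to be idempotent, for example by ignoring a `job_id` that is already queued or finished.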
4. Delivery
• Email and on-screen download link once the transcript is generated.
• Admin console where I can view, override, or regenerate any job.
Acceptance criteria
• At least 95% word accuracy on clear audio.
• Perfect compliance with the formatting specs above.
• End-to-end turnaround (upload to delivery) demonstrably functional on my live domain.
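The listing does not say how word accuracy is measured. One common reading is 1 minus the word error rate (WER), where WER is the word-level edit distance between a human reference transcript and the engine output, divided by the number of reference words. The sketch below assumes that definition along with a simple normalization (lowercasing, punctuation dropped); both would need to be agreed with the client.

```python
import re
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase and split into words, ignoring punctuation other than apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # previous[j] holds the distance between the ref words seen so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER, floored at 0 because WER can exceed 1 when many words are inserted."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


def meets_target(reference: str, hypothesis: str, target: float = 0.95) -> bool:
    """True when accuracy reaches the 95% threshold from the acceptance criteria."""
    return word_accuracy(reference, hypothesis) >= target
```

Running this over a small set of clear recordings with hand-checked reference transcripts would give a repeatable way to show whether the first criterion is met.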
I will supply sample formatted transcripts, brand colors, and server access the moment we start. If you’ve integrated speech-to-text solutions before and can hit the accuracy and formatting marks, I’m ready to move quickly.
About this listing
This remote opportunity was imported from Freelancer and is shown here for discovery. To apply, follow the link to the original posting.