A fairly common problem with automatic YouTube transcripts is that they get proper names, company names, and region-specific wording or pronunciation wrong.
One way to fix that is to run the transcript through an LLM with a prompt that spells out those issues.
For instance, for proper names:
## Spellings

Below is a list of names that may be mentioned in the video, followed by their incorrect spellings. Based on the context, if you find out that they are mentioned, either with these incorrect spellings or any other variations, fix the spelling. If only a part of that name is mentioned, fix the spelling of that part only. Add the corresponding link only the first time they are mentioned. Additionally, if you find out that a word has been transcribed incorrectly based on the context, you should also fix it, even if it's not in the list below.

### People

* [Franco Betteo](https://www.linkedin.com/in/franco-betteo-3230bb96/) (author of the video): "Be Teo", "Ve Teo", "Betheo", "Be Theo"
This way, when the LLM encounters one of the listed variations of my last name, or something similar, it will fix the spelling and add the link to my LinkedIn profile in Markdown format.
The same logic works for company names and other specific words and phrases that might appear: expressions from your country, British vs. American English, or pronunciations that the transcript fails to capture (for example, across different US states).
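
If the list of names grows, the spellings section of the prompt can be generated from data instead of written by hand. The following is a minimal sketch of that idea, not the code used for this post: the `NameEntry` class and the `build_spellings_section` function are hypothetical names introduced here for illustration, and the instruction text is shortened.

```python
from dataclasses import dataclass, field


@dataclass
class NameEntry:
    """A correctly spelled name plus the misspellings seen in transcripts."""
    name: str
    url: str
    misspellings: list = field(default_factory=list)
    note: str = ""  # optional context, e.g. "author of the video"


# Shortened version of the instructions; adapt the wording to your needs.
SPELLING_INSTRUCTIONS = (
    "Below is a list of names that may be mentioned in the video, "
    "followed by their incorrect spellings. If they are mentioned, with "
    "these spellings or any other variation, fix the spelling. Add the "
    "corresponding link only the first time they are mentioned."
)


def build_spellings_section(people):
    """Render the '## Spellings' part of the prompt as Markdown text."""
    lines = ["## Spellings", "", SPELLING_INSTRUCTIONS, "", "### People", ""]
    for person in people:
        variants = ", ".join(f'"{m}"' for m in person.misspellings)
        note = f" ({person.note})" if person.note else ""
        lines.append(f"* [{person.name}]({person.url}){note}: {variants}")
    return "\n".join(lines)


people = [
    NameEntry(
        name="Franco Betteo",
        url="https://www.linkedin.com/in/franco-betteo-3230bb96/",
        misspellings=["Be Teo", "Ve Teo", "Betheo", "Be Theo"],
        note="author of the video",
    )
]
section = build_spellings_section(people)
print(section)
```

Keeping the names in a structure like this makes it easy to add a new misspelling whenever one shows up in a transcript.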
And since we are already using an LLM to clean those specifics, we can add more general things we would like to fix, such as:
* Convert all mentions of years and quantities to numeric format.
* Convert "por ciento" to its symbol, "%"
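
As a rough sketch of how general rules like these might be combined with the spellings section into one prompt, consider the snippet below. The function name `build_prompt`, the opening instruction, and the `## General rules` and `## Transcript` headings are assumptions made for illustration; they do not come from any particular tool.

```python
GENERAL_RULES = [
    "Convert all mentions of years and quantities to numeric format.",
    'Convert "por ciento" to its symbol, "%".',
]


def build_prompt(spellings_section, rules, transcript):
    """Join the spellings section, general rules, and transcript into one prompt."""
    rules_block = "\n".join(f"* {rule}" for rule in rules)
    parts = [
        "Clean up the transcript below following these conventions.",
        spellings_section,
        "## General rules",
        rules_block,
        "## Transcript",
        transcript,
    ]
    # Blank lines between parts keep the Markdown structure readable.
    return "\n\n".join(parts)


prompt = build_prompt(
    spellings_section='## Spellings\n\n* Franco Betteo: "Be Teo", "Ve Teo"',
    rules=GENERAL_RULES,
    transcript="Hola, soy Franco Be Teo y hoy hablamos del veinte por ciento.",
)
print(prompt)
```

Putting the transcript last keeps all the conventions together at the top, so adding a new rule only means appending a string to the list.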
You can be as creative as you want and include as many conventions as you would like the transcript to follow. Applying the same prompt across a batch of transcripts gives you consistent transcripts that all follow your desired format.
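
Applying the prompt in bulk could look roughly like the sketch below. It assumes the raw transcripts are `.txt` files in one folder, and it takes the LLM as a plain function, `call_llm`, that receives a prompt string and returns the cleaned text. Both that function and `clean_transcripts` are hypothetical placeholders; you would plug in whichever LLM client you actually use.

```python
from pathlib import Path


def clean_transcripts(input_dir, output_dir, prompt, call_llm):
    """Run every .txt transcript in input_dir through the LLM and save the results.

    `prompt` holds the conventions (spellings, general rules).
    `call_llm` is any function that maps a prompt string to the model's reply.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(input_dir.glob("*.txt")):
        raw = path.read_text(encoding="utf-8")
        # Same conventions for every file; only the transcript changes.
        cleaned = call_llm(f"{prompt}\n\n## Transcript\n\n{raw}")
        out_path = output_dir / f"{path.stem}.md"
        out_path.write_text(cleaned, encoding="utf-8")
        written.append(out_path)
    return written
```

Because the LLM is passed in as a function, the loop can be tried with a stand-in (for example, one that returns the prompt unchanged) before spending tokens on a real model.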
Thanks to Silver.dev for the open code example of how they use this.