It helps to talk it out, whether you’re writing a first draft, processing an emotional experience, thinking through a tough problem at work, or trying to make a decision.
The thoughts and feelings are ineffable inside of you.
It’s a process to draw them out into spacetime.
You can write, but longhand and typing are slower than your rate of thought. Also, the blank page is daunting.
It’s much easier to just start talking.
The problem is you can’t remember what you’ve said.
And that’s where transcription comes in.
How I started using transcription
I started using transcription for writing.
At first, I would record the audio, then play it back and type out what I heard. Lots of pausing and rewinding.
Then I discovered there was software that could record and transcribe at the same time.
For a while, I used the speech-to-text feature built into the text messaging app on my phone. I wrote an entire book using this method.
Now I use Otter.
Use cases
The fundamental use case is quickly and easily getting your thoughts and feelings into words.
This is obviously helpful for writing. I can finish a first draft 10x faster with transcription versus typing.
I realized there were more use cases when I started to use transcription for personal journaling.
Journaling is good for emotional processing, increased self-awareness, creative expression, goal setting, and overall improved mental health.
In my experience, the benefits of journaling are even greater when you journal by speaking aloud. Perhaps because you feel more heard (even if just by yourself).
These are some of the use cases I have in mind for speech-to-text transcription:
Writing
Idea generation
Problem solving
Decision making
Journaling
Catharsis
Self-guided “therapy” (disclaimer: not medical advice, please consult a professional)
Existing transcription solutions
This blog post by Zapier has a great breakdown of the available transcription services.
There are basically three categories:
Human
AI / software
Human + AI / software
In the past, human scribes were the only option for transcription.
Now, AI and other software provide completely automated transcription.
But the technology still makes mistakes:
Mishearing: often a result of poor audio quality or slurred speech
Omissions: missing certain words or phrases
Incorrect word choices: confusing homophones (words that sound alike) or mistaking proper names for common words
Lack of context: lacking the prior knowledge to make the correct word choices
Punctuation errors: omitting punctuation or choosing the wrong punctuation mark
Formatting errors: inconsistent spacing or incorrect capitalization
The most accurate transcription services use a combination of humans and technology. For example, this is the copy on the TranscribeMe website:
We deliver the most accurate transcriptions at competitive rates due to a combination of the latest in AI, paired with our trained & experienced transcriber network.
My guess is that the bulk of the transcribing is done by the AI technology, and then human transcribers review the AI-generated transcripts to fix errors.
Shortfalls of existing solutions
With normal transcription, it’s just you talking. You say what you say and the service converts your speech to text.
I’m thinking of a few areas where a transcription service could go above and beyond:
Asking questions: when you pause and don’t know what to say next
Prompting: nudges other than questions, such as encouragement or suggestions
Reminding: bringing up something you said before
Pointing out inconsistencies: identifying contradictions with what you said before
Making you feel heard: like you’re talking to another human
Converting your speech into text saves you time. It allows you to write with your voice, rather than longhand or typing.
But it’s still only your intelligence that’s coming up with the words.
I think the transcription service can do more, to the extent that the interaction would be more like a conversation.
It can help to have a consciousness other than your own that is analyzing what you’re saying and providing another perspective.
Someone else asking you questions would be particularly helpful. It’s like a reverse AI chatbot. Instead of the human prompting the AI to get answers from the AI’s knowledge and analysis of a huge dataset of text, the AI would ask the human questions, guiding the human deeper into the exploration of their own mind.
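As a rough illustration of the "asking questions when you pause" idea, here is a minimal Python sketch. It assumes the transcription tool gives you segments with start and end times in seconds. The `Segment` type, the 4-second threshold, and the canned questions are all hypothetical, made up for this example; a real service would choose questions based on what was actually said.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    text: str


# Hypothetical follow-up questions, used in rotation.
QUESTIONS = [
    "What makes that important to you?",
    "Can you say more about that?",
    "What would you do next?",
]


def find_pauses(segments, min_gap=4.0):
    """Return each index i where the silence after segments[i] lasts at least min_gap seconds."""
    pauses = []
    for i in range(len(segments) - 1):
        gap = segments[i + 1].start - segments[i].end
        if gap >= min_gap:
            pauses.append(i)
    return pauses


def prompts_for(segments, min_gap=4.0):
    """Pair the last thing said before each long pause with a question."""
    out = []
    for n, i in enumerate(find_pauses(segments, min_gap)):
        out.append((segments[i].text, QUESTIONS[n % len(QUESTIONS)]))
    return out


segments = [
    Segment(0.0, 3.2, "I want to quit my job."),
    Segment(3.5, 6.0, "But I also like my team."),
    Segment(12.0, 15.0, "I guess I'm not sure what I want."),
]

for said, question in prompts_for(segments):
    print(f'After "{said}": {question}')
```

This only covers the pause trigger. Reminding, pointing out inconsistencies, and making someone feel heard need an understanding of the content, which is where a human or a capable AI would come in.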
Possible overlap with therapy
I’ve written more about this here:
And here:
Disclaimer: This is not medical advice. Please consult a professional.
Idea for a supplemented transcription service
Let’s start with human transcription: someone listens to what you say and writes it down.
Now imagine that the scribe interacts with you: asking questions, prompting, reminding you of what you said before, pointing out inconsistencies, and making you feel heard.
The human scribe can’t interact with you and take notes at the same time, so they use software to record and transcribe the audio of your conversation.
After the conversation, the same human scribe edits the transcript. They have context from the conversation. They could even add notes and comments.
Then the human scribe sends the edited transcript to the speaker.
Outline of minimum viable product
Schedule a 30-minute meeting via Calendly
Some things we can discuss on the call:
Writing a first draft
Thinking through a problem
Making a decision
Getting feedback on something
Strategizing and coming up with a plan
Before the call, fill out a form to answer these questions:
Why did you schedule this call?
What would you like to discuss on our call?
What are you hoping to achieve from this call?
Speaker can assign roles to the transcriber:
Only listen
Listen and ask questions
Listen, ask questions, and give advice
Call will be on Google Meet or Zoom
Use Otter to transcribe the audio during the call
After the call, edit the transcript for omissions, incorrect word choices, punctuation errors, formatting errors, spelling, and grammar (no changes to the originally spoken words)
Send transcript via Google Docs or Notion to allow for continued conversation via commenting
The cost will be $50 per 30-minute call
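For anyone who wants to turn the outline above into a booking form, here is one possible way to model a call request in Python. The questions, roles, price, and call length come from the outline; the names `Role`, `CallRequest`, and `missing_answers` are hypothetical and only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    LISTEN = "Only listen"
    LISTEN_ASK = "Listen and ask questions"
    LISTEN_ASK_ADVISE = "Listen, ask questions, and give advice"


# The three pre-call questions from the outline.
INTAKE_QUESTIONS = (
    "Why did you schedule this call?",
    "What would you like to discuss on our call?",
    "What are you hoping to achieve from this call?",
)

PRICE_PER_CALL_USD = 50
CALL_MINUTES = 30


@dataclass
class CallRequest:
    role: Role
    answers: dict = field(default_factory=dict)  # question -> answer text

    def missing_answers(self):
        """List the intake questions that are still blank."""
        return [q for q in INTAKE_QUESTIONS if not self.answers.get(q, "").strip()]

    def ready(self):
        """A request is ready once every intake question has an answer."""
        return not self.missing_answers()


request = CallRequest(
    role=Role.LISTEN_ASK,
    answers={
        INTAKE_QUESTIONS[0]: "I'm stuck on a first draft.",
        INTAKE_QUESTIONS[1]: "The outline of my next post.",
    },
)
print(request.ready())            # False, one question is unanswered
print(request.missing_answers())
```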
Why I think people will find this valuable
People want to be heard, but they also want to be engaged with: to have what they share analyzed to an extent, but without judgment, with the objective light of pure consciousness shining back on them.
I think there’s immense value in getting your thoughts out of your head.
When you see your thoughts written down as words, you process them differently.
I’ve done this several times: I listen to someone, write down exactly what they said, and read it back to them. They’re like, “Woah, I said that?”
Sometimes you don’t even know what you’re thinking or feeling until you write it down.
And this is all in addition to the baseline value of making the leap from the mental and emotional contents trapped inside of you to words written down. Many struggle to make that leap, but transcription makes it quicker and easier.
DISCLAIMER: THIS IS NOT MEDICAL ADVICE
The information on this website is for informational purposes only. No material on this site is intended to be a substitute for professional medical advice, diagnosis or treatment. Always seek the advice of your physician or other qualified health care provider with any questions you may have regarding a medical condition or treatment and before undertaking a new health care regimen, and never disregard professional medical advice or delay in seeking it because of something you have read on this website.
interesting idea. i am pretty sold on the concept of a transcription assistant that will also help point out inconsistent logic and/or ask questions to help extract the essence of the idea. but i would need that on-demand, when the motivation is there. scheduling a call would be a dealbreaker for me
fwiw i think using a purpose built AI for this is a phenomenal use case to allow scalability. basically just otter, but with some live feedback. I’m quite confident this is possible using the current state of open source technology
as an aside, i’ve been doing transcription based notes for a while at your recommendation. i will talk it out for two or three sessions, then use the transcription of the last one as my outline for writing, since it’s normally much more cohesive