# Project overview
You are building a social media post generator app. Users paste text or upload text files (.pdf, .docx, .md), from which the OpenAI API creates an X or LinkedIn post according to platform- and content-specific customizations provided by the user.
You will be using Python, Streamlit, and the OpenAI API.
# Core functionalities
Single-page application
- User provides OpenAI API key in a text field
- The app will check whether the API key is valid
- The app will display a message to the user whether the API key is valid or not
- User selects AI model from a dropdown field (gpt-4o-mini, gpt-4o)
- User provides the text input
  - 1a. User inserts the text into a large text field, or
  - 1b. User drops one or multiple files into a file drop target zone
- User selects preferences for post generation
  - User selects platform from dropdown (LinkedIn or X)
  - User selects post style (factual or creative)
  - User selects tone (professional or casual)
  - User selects maximum length in characters (up to 280 for X, up to 3,000 for LinkedIn)
- User clicks generate button
- The app will take the inputs from the fields and build a prompt for the OpenAI API based on the selected preferences
- The app will call the OpenAI API with the prompt
- The app will display the generated post in a large text field which can be copied to the clipboard
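The prompt-assembly step above can be sketched as follows. The function name `build_prompt` and the exact instruction wording are illustrative assumptions, not part of the spec:

```python
def build_prompt(source_text: str, platform: str, style: str,
                 tone: str, max_chars: int) -> str:
    """Assemble the prompt sent to the OpenAI API from the UI inputs."""
    return (
        f"Write a {style}, {tone} post for {platform} "
        f"of at most {max_chars} characters, based on the following text:\n\n"
        f"{source_text}"
    )

# Example: an X post from a short source text
prompt = build_prompt("We shipped v2.0 today.", "X", "factual", "casual", 280)
```

The resulting string would be passed as the user message of a `chat.completions.create` call, with the platform and length constraints restated in a system message if stricter adherence is needed.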
# Documentation
## How to use the OpenAI API and Streamlit chat completions

CODE EXAMPLE:

```python
from openai import OpenAI
import streamlit as st

# Initialize session state variables
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display title and instructions
st.title("Memos-to-Text")
st.markdown("""
## AI-powered Speech-to-Text Transcriptions

### What does Memos-to-Text do?
Memos-to-Text processes uploaded audio files, provides a transcription in the chat and enables interaction with it (e.g. summary, extraction of information, generation of response drafts).

### How to use it:
- **Step 1 | Input parameters**: Provide your OpenAI API key and, if needed, change the pre-selected parameters. Get more context by hovering over the info sign of each input field.
- **Step 2 | Upload files and receive transcription**: Upload your audio files (mp3, wav, m4a) to trigger the transcription, which will be displayed in the chat interface.
- **Step 3 | Ask questions**: Interact with the transcriptions (e.g. summarize, draft a response, ask ChatGPT for advice on the content).
- **Important note**: All inputs are saved in the session state, so you don't have to re-enter them. They will, however, be deleted once you close the app.

---
""")

# API key input field
api_key_input = st.text_input(
    "Enter OpenAI API key",
    type="password",
    help="Enter your OpenAI API key to use the application",
)

if api_key_input:
    # Remember the key for the rest of the session
    st.session_state.openai_api_key = api_key_input

    # Initialize the OpenAI client with the provided API key
    try:
        client = OpenAI(api_key=api_key_input)
        # Test the connection with a lightweight API call
        client.models.list()
        st.success("✅ API key is valid --> Now upload an audio file and let the transcription be done for you!")
    except Exception as e:
        error_message = str(e)
        if "auth" in error_message.lower() or "api key" in error_message.lower():
            st.error("❌ Invalid API key. Please check your key and try again.")
        else:
            st.error(f"❌ Error connecting to OpenAI: {error_message}")
        client = None

    # Model configuration
    col1, col2 = st.columns(2)
    with col1:
        model_name = st.selectbox(
            "Large Language Model (LLM)",
            ["gpt-4o", "gpt-4o-mini"],
            index=0,
            help=(
                "The model to use for the LLM. "
                "gpt-4o is the most recent general model; "
                "gpt-4o-mini is its more lightweight and cost-effective alternative."
            ),
        )
    with col2:
        temperature = st.slider(
            "Temperature",
            min_value=0.0,
            max_value=1.0,
            value=0.1,
            step=0.1,
            help="Temperature controls randomness in responses. Lower values are more focused and deterministic. 0.1 is a good value for most cases.",
        )

    # File upload and transcription section
    audio_file = st.file_uploader(
        "Upload an audio file",
        type=["mp3", "wav", "m4a"],
    )

    # Transcription button
    if st.button("Start Transcription"):
        if audio_file is None:
            st.error("Please upload an audio file first.")
        elif client is None:
            st.error("OpenAI client not initialized. Please check your API key configuration.")
        else:
            with st.spinner(f"Processing {audio_file.name}..."):
                try:
                    # Transcribe the audio file directly from memory
                    transcript = client.audio.transcriptions.create(
                        model="whisper-1",
                        file=audio_file,
                    )
                    # Add the transcription to the chat as an assistant message
                    transcribed_text = f"Transcription: {transcript.text}"
                    st.session_state.messages.append(
                        {"role": "assistant", "content": transcribed_text}
                    )
                    st.session_state.messages.append({
                        "role": "assistant",
                        "content": (
                            "This is the transcription of your audio file. I can do many things with it:\n"
                            "(1) summarize it,\n(2) extract information,\n(3) generate a response draft,\n"
                            "(4) ask ChatGPT for advice on the content,\n(5) ... you name it!\n"
                            "What would you like me to do with this transcription - pick a number or write your own prompt!"
                        ),
                    })
                except Exception as e:
                    st.error(f"Error processing audio file: {str(e)}")

    # Display chat history (this shows the transcription once)
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

    # Chat input and response
    prompt = st.chat_input("What would you like to know about the transcription?")
    if prompt and client is not None:
        # Add the user message to the history and display it
        st.session_state.messages.append({"role": "user", "content": prompt})
        with st.chat_message("user"):
            st.markdown(prompt)
        # Generate and display the assistant response as a stream
        with st.chat_message("assistant"):
            stream = client.chat.completions.create(
                model=model_name,
                messages=[
                    {"role": m["role"], "content": m["content"]}
                    for m in st.session_state.messages
                ],
                temperature=temperature,
                stream=True,
            )
            response = st.write_stream(stream)
        st.session_state.messages.append({"role": "assistant", "content": response})
else:
    st.warning("⚠️ Please enter your OpenAI API key to use the application")
    client = None
```
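The example above uploads audio; this app instead needs to read text out of uploaded .pdf, .docx, and .md files. A minimal sketch of that step, assuming the third-party `pypdf` and `python-docx` packages are available (imported lazily, so plain-text files work without them); the helper name `extract_text` is illustrative:

```python
import io

def extract_text(filename: str, data: bytes) -> str:
    """Return the plain text contained in an uploaded file, dispatching on extension."""
    name = filename.lower()
    if name.endswith((".md", ".txt")):
        return data.decode("utf-8")
    if name.endswith(".pdf"):
        from pypdf import PdfReader  # third-party: pip install pypdf
        reader = PdfReader(io.BytesIO(data))
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if name.endswith(".docx"):
        from docx import Document  # third-party: pip install python-docx
        doc = Document(io.BytesIO(data))
        return "\n".join(p.text for p in doc.paragraphs)
    raise ValueError(f"Unsupported file type: {filename}")
```

In Streamlit, `st.file_uploader(..., accept_multiple_files=True)` returns a list of `UploadedFile` objects; each object's `.name` and `.getvalue()` supply the two arguments above, and the extracted texts can be concatenated before prompt assembly.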