echohive42

Chat with and Summarize PDF documents with Langchain and OpenAI video files:

This is for video: https://youtu.be/JJ6ATxp42cQ

Comments

No problem. I wish I could have been of more help.

Echo Hive

cool! thank you for your answer :)

Ilias Mokas

Hmm. I have never used Azure, so I can't speak to it. But you can search for this error on Google and also in the issues on Azure's OpenAI GitHub repo.

Echo Hive

Thank you for the quick reply. I tried this but still have the same issue. I think it is because I am trying to use Azure. Maybe it needs AzureChatOpenAI? I tried the following:

    class Chat_With_PDFs_and_Summarize:
        def __init__(self, model_name="text-davinci-003", temperature=0):
            # initialize AzureChatOpenAI for summarization and chat
            self.llm_summarize = AzureChatOpenAI(
                model_name=model_name,
                deployment_name="gpt-langchain",
                openai_api_type="azure",
                openai_api_base=os.environ["OPENAI_API_BASE"],
                openai_api_version=os.environ["OPENAI_API_VERSION"],
                openai_api_key=os.environ["OPENAI_API_KEY"],
                temperature=temperature,
            )
            self.llm_chat = AzureChatOpenAI(
                model_name=model_name,
                deployment_name="gpt-langchain",
                openai_api_type="azure",
                openai_api_base=os.environ["OPENAI_API_BASE"],
                openai_api_version=os.environ["OPENAI_API_VERSION"],
                openai_api_key=os.environ["OPENAI_API_KEY"],
                temperature=temperature,
            )

But I am still getting errors such as:

    raise error.InvalidRequestError(
    TypeError: InvalidRequestError.__init__() missing 1 required positional argument: 'param'

Ilias Mokas

I just ran the code and it worked fine on my end. This error means that you are attempting to call a model which doesn't exist. Since we are calling the chat model, this might happen if your openai and langchain libraries aren't up to date. Try pip installing these versions of both:

    openai==0.27.2
    langchain==0.0.135

With these versions on my machine the code works without any issues. Hopefully this helps. Also, if you are defining the models explicitly in your code, make sure the model name is spelled correctly.

Echo Hive

Hi echohive, and thank you for the detailed explanation. I am trying to run the code but I am getting this error:

    openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

Because the API key is not a personal one, I set it as follows:

    os.environ["OPENAI_API_KEY"] = "ABC"
    openai.api_key = "ABC"
    openai.api_type = "azure"
    openai.api_version = "XXX"
    openai.api_base = "XXXXX"

Any ideas where I can look to solve this issue? I would very much appreciate your help!

Ilias Mokas

No, I just used your code and your PDF, unmodified. Where could I add a 30s delay in your code?

Tuan Tran

Make sure you are not running an infinite loop in your code or otherwise making too many requests by mistake.

Echo Hive

A rate limit error happens if you are making too many requests to the API in a given time frame. But most of the time this error occurs because the OpenAI API is overloaded. If your code is not making too many requests, then trying again after a while usually solves the problem.

Echo Hive
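
For readers asking where a delay could go: one common pattern is to wrap the API call in a retry helper that waits between attempts. The following is only a minimal sketch under stated assumptions, not the code from the video. `RateLimitError` here is a local stand-in for whatever exception your client raises, and `call_with_backoff` is a hypothetical helper name.

```python
import time


class RateLimitError(Exception):
    """Local stand-in for the rate-limit exception raised by an API client."""


def call_with_backoff(func, max_retries=5, base_delay=1.0,
                      sleep=time.sleep, retry_on=(RateLimitError,)):
    """Call func(); on a rate-limit error, wait and try again.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With a limit such as the 3 requests per minute quoted in this thread, a `base_delay` of around 20 seconds would match the wait the error message suggests. Pass your real request as `func`, for example as a `lambda`, and put your client's own exception class in `retry_on`.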

Why didn't you get this error: "Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-n2em2hU6UImYoI0iahrW8K2k on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method."?

Tuan Tran

Have you checked out databutton.com? It seems like it would streamline the process of sharing your Streamlit scripts directly with Patreon supporters and let them experiment with them with less friction.

matari

Might I add that you can open source it yourself as well. Just remember the good ol' echo when you do.

Echo Hive

Thank you, I will consider this in the future.

Echo Hive

Should you open source this, let me know. I can contribute.

SHUBHAM NAGAR

You would have to check the file extension and load each file accordingly. You would also have to split the text in whatever way you wish. PDF files come page by page, but regular txt files don't, so you can split them by a set number of words, or by paragraphs, for example.

Echo Hive
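
The idea of checking the extension and then splitting the text could be sketched roughly as follows. This is an illustrative outline using only the Python standard library, not the script from the video. The function names `load_text`, `split_by_words`, and `split_by_paragraphs` and the default chunk size are assumptions made for the example.

```python
import csv
import json
from pathlib import Path


def load_text(path):
    """Return the contents of a .txt, .json, or .csv file as one string."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix == ".json":
        # Re-serialize so nested data becomes readable, indented text.
        data = json.loads(path.read_text(encoding="utf-8"))
        return json.dumps(data, indent=2)
    if suffix == ".csv":
        # One line of text per row, cells separated by ", ".
        with path.open(newline="", encoding="utf-8") as f:
            return "\n".join(", ".join(row) for row in csv.reader(f))
    raise ValueError(f"Unsupported file type: {suffix}")


def split_by_words(text, words_per_chunk=200):
    """Split text into chunks of at most words_per_chunk words."""
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]


def split_by_paragraphs(text):
    """Split text on blank lines, dropping empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]
```

The resulting chunks would then take the place of the PDF pages in whatever embedding step the script performs.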

How would I go about editing the script to handle JSON, CSV, and txt files?

matari

I am happy to hear that you got it working! Build some apps and share them at the discord too :)

Echo Hive

I set the environment variable for OPENAI_API_KEY and then added openai.api_key = os.environ["OPENAI_API_KEY"] to the code so the os library pulls it in, and the script worked. Thanks a ton. I am planning to see if I can make this into a Streamlit app, but use the DirectoryLoader from Langchain to auto-load all the PDFs by default. I am going to follow your other videos to try to figure this out. Thanks a ton.

Chuck Williams

Yes. Try that. Hopefully that will work.

Echo Hive

Yes, this is strange. I set the openai.api_key variable to my actual OpenAI API key. I didn't set an environment variable. I will comment out the openai.api_key variable and try to set it as an environment variable.

Chuck Williams

If you also made sure to save the file after updating the line of code with your API key, it should work. I am not sure why this error is happening.

Echo Hive

That is a bit strange. That should work.

Echo Hive

If there isn't any, that is fine. In the absence of a line of code that explicitly defines the key, the environment variable kicks in automatically.

Echo Hive
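
The fallback described here, an explicitly supplied key first and the environment variable otherwise, can be illustrated with a small helper. This is a hedged sketch of the general pattern only; `resolve_api_key` is a hypothetical function and does not reproduce how the openai or langchain libraries look up the key internally.

```python
import os


def resolve_api_key(explicit_key=None, env_var="OPENAI_API_KEY"):
    """Return the explicit key if given, else the value of env_var.

    Raises RuntimeError when neither source provides a key, so a
    missing key is reported before any API call is attempted.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"No API key found: pass one explicitly or set {env_var}."
        )
    return key
```

The validation error quoted elsewhere in this thread names the same two options: set the `OPENAI_API_KEY` environment variable, or pass `openai_api_key` as a named parameter when creating `ChatOpenAI`.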

I checked the code and don't see anywhere that the os library checks or sets an environment variable.

Chuck Williams

I did that. I set this variable, openai.api_key="sk-xxxxxxxxxxxxxxxxxxxxxx", and I am still getting the same error. Which line should I look for that checks for the key in the environment variables?

Chuck Williams

You need to remove the line which checks for the key in the environment variables, if there is such a line. You need to input the key as a string in between quotation marks, such as openai.api_key = "827364g.....". Also make sure to save the file before running it when you update the code.

Echo Hive

So I updated the code to import openai and am now getting this error:

    Traceback (most recent call last):
      File "/home/chuckwilliams11/chat-financials/main.py", line 131, in <module>
        chat = Chat_With_PDFs_and_Summarize()
      File "/home/chuckwilliams11/chat-financials/main.py", line 33, in __init__
        self.llm_summarize = ChatOpenAI(model_name=model_name, temperature=temperature)
      File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
    pydantic.error_wrappers.ValidationError: 1 validation error for ChatOpenAI
    __root__
      Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)

I do have openai.api_key set to my actual API key inside the script, so I am not sure what is causing the error. I also ran the pip install requirements.txt --upgrade command to make sure the libraries are updated.

Chuck Williams

I have installed openai. I ran pip install openai and I am getting the message that openai is already installed. What might I be missing? Thanks a ton.

Chuck Williams

Also, I put my OpenAI key in this section, openai.api_key = " ", and I keep getting this error: NameError: name 'openai' is not defined.

Chuck Williams

Yes, just update those lines.

Echo Hive

Also, can I only use the 3.5 turbo model, or can I use gpt-4? Do I just update this code, def __init__(self, model_name="gpt-3.5-turbo", temperature=0), to def __init__(self, model_name="gpt-4", temperature=0)?

Chuck Williams

It shouldn't be too difficult. I tried and didn't get it working the way I wanted, but I was short on time. gpt-4 can, sure; you just need to change the model name. But be mindful of the cost.

Echo Hive

How difficult would it be to add a Streamlit UI and file uploader to this script, similar to your other video where there is a Streamlit UI with the file uploader? Also, can gpt-4 answer questions across PDFs, that is, get answers from multiple files? Thanks a ton.

Chuck Williams

Were you able to fix this issue?

Echo Hive

I have never seen this error before and am not sure how to approach it. Please do a Google search for it. Also try installing the requirements one by one using pip install "package" --upgrade (use double dashes before the word upgrade).

Echo Hive

Yes. You would use tiktoken to count the tokens and then do some quick math with the embeddings cost per 1k tokens.

Echo Hive
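
The estimate described above comes down to the token count divided by 1,000 and multiplied by the price per 1k tokens. Below is a rough sketch, assuming the `cl100k_base` encoding is the right one for your embedding model; check this against current documentation. The helper names are hypothetical, and the price in the usage note is a placeholder rather than a quoted rate.

```python
def count_tokens(texts, encoding_name="cl100k_base"):
    """Count tokens across all texts using tiktoken.

    tiktoken is imported here so the cost math below can be used
    without it. The encoding name is an assumption.
    """
    import tiktoken

    encoding = tiktoken.get_encoding(encoding_name)
    return sum(len(encoding.encode(text)) for text in texts)


def estimate_embedding_cost(token_count, price_per_1k_tokens):
    """Return the estimated cost for embedding token_count tokens."""
    return token_count / 1000 * price_per_1k_tokens
```

For example, `estimate_embedding_cost(count_tokens(chunks), price_per_1k_tokens=0.0004)` would give a figure in the same currency as the price, where `chunks` is your list of text pieces and `0.0004` stands in for the current rate on the pricing page.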

Hi, I am getting this error when I try to install the requirements. I am using conda. Do you know what the issue is here?

    creating build\temp.win-amd64-cpython-311\Release\src\sentencepiece
    "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\mikew\AppData\Local\Programs\Python\Python311\include -IC:\Users\mikew\AppData\Local\Programs\Python\Python311\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /EHsc /Tpsrc/sentencepiece/sentencepiece_wrap.cxx /Fobuild\temp.win-amd64-cpython-311\Release\src/sentencepiece/sentencepiece_wrap.obj /std:c++17 /MT /I..\build\root\include
    cl : Command line warning D9025 : overriding '/MD' with '/MT'
    sentencepiece_wrap.cxx
    src/sentencepiece/sentencepiece_wrap.cxx(2822): fatal error C1083: Cannot open include file: 'sentencepiece_processor.h': No such file or directory
    error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.35.32215\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
    [end of output]

Network Technician

Would it be possible to estimate the cost before doing the embeddings?

Adolfo Rodriguez
