This section will give you a brief overview of the 2markdown API.
To convert a URL to markdown you need two things: a header X-Api-Key containing your key, and a JSON request body with the following parameter.
Parameter | Description |
---|---|
url | The URL to be processed |
curl --request POST \
--url https://2markdown.com/api/2md \
--header 'Content-Type: application/json' \
--header 'X-Api-Key: YOUR_API_KEY' \
--data '{"url": "https://newsletterify.com"}'
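For readers who prefer Python over curl, the same request can be assembled with the standard library alone. This is a minimal sketch rather than an official client: the helper name build_2md_request is hypothetical, and the endpoint, header, and body are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://2markdown.com/api/2md"  # endpoint from the curl example


def build_2md_request(target_url, api_key):
    """Build (but do not send) the POST request for the 2markdown API.

    build_2md_request is a hypothetical helper name, not part of any SDK.
    """
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,
        },
        method="POST",
    )


request = build_2md_request("https://newsletterify.com", "YOUR_API_KEY")

# To actually send it (urlopen raises urllib.error.HTTPError on 4xx/5xx):
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the method, headers, and body before any quota is used.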
The response will be a JSON object with the following fields.
Field | Description |
---|---|
article | The essential content of the website as plaintext |
error | In case of an error this field is present and gives an indication of what the problem is. |
The following are status codes the server might answer with.
Status Code | Description |
---|---|
200 | The request is valid and counts towards your quota. |
400 | The request is invalid. Check the documentation above for the structure of valid requests and make sure the URL is valid. |
422 | We couldn't extract enough content. Does not count towards your quota and is not billed. |
500 | Something went wrong. Please try again later. |
A successful response looks like this:
{
  "article": "## Empower Your Knowledge, Unclutter Your Inbox\n\n## ..."
}
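The fields and status codes above suggest a simple way to interpret a response in client code. The following is a minimal sketch, not an official client: ToMarkdownError and extract_article are hypothetical names, and it assumes the error field, when present, can be rendered as text.

```python
class ToMarkdownError(Exception):
    """Hypothetical exception type for failed 2markdown requests."""


def extract_article(status_code, payload):
    """Return the article for a 200 response, otherwise raise ToMarkdownError.

    status_code: HTTP status of the response
    payload: decoded JSON body (a dict), or None if no body could be decoded
    """
    payload = payload or {}
    if status_code == 200:
        return payload["article"]
    # The error field is documented as present in case of an error.
    detail = payload.get("error", "no error detail provided")
    reasons = {
        400: "invalid request",
        422: "not enough content could be extracted (not billed)",
        500: "server error, try again later",
    }
    reason = reasons.get(status_code, "unexpected status")
    raise ToMarkdownError(f"{status_code}: {reason} ({detail})")
```

Raising on every non-200 status keeps the distinction between a billed success and the unbilled 422 case visible to the caller instead of collapsing both into an empty result.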
You can use 2markdown with the LangChain document loader as follows.
First, install LangChain with pip install langchain.
Then, you can use the following code to extract the article from a website:
from langchain.document_loaders import ToMarkdownLoader
# https://2markdown.com
md_api_key = "YOUR_KEY"
loader = ToMarkdownLoader(
    url="https://newsletterify.com",
    api_key=md_api_key,
)
docs = loader.load()
# the loader returns an article containing the markdown
print(docs[0].page_content)
# output:
# ## Empower Your Knowledge, Unclutter Your Inbox
#
# ## Drowning in a Sea of Newsletters?
#
# Are you tired of wading through countless newsletters, struggling to find the
# information you need? Is your inbox cluttered with irrelevant content, making it
# difficult to stay up-to-date in today's fast-paced world? You're not alone. We
# know the feeling and that's why we created Newsletterify.
# ...
For an example of how to use LangChain with 2markdown, have a look at our use case »Extracting Structured News Events With LangChain And OpenAI Functions«.
👉 You can find more information in the official LangChain documentation.
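This section will give you a brief overview of how to use 2markdown as a function with the OpenAI Python library.
First, install the OpenAI Python library with pip install openai, as well as the requests library to call 2markdown with pip install requests.
The snippets below use the openai.ChatCompletion interface, which was removed in version 1.0 of the library, so they need an earlier release, for example pip install "openai<1".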
Let's do some preparations first:
# https://2markdown.com
md_api_key = "YOUR_KEY"
# https://openai.com
openai_api_key = "YOUR_KEY"
import openai
import json
import requests
openai.api_key = openai_api_key
Then, we can implement a function we'll later provide to OpenAI. It calls the 2markdown API with a provided URL:
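# example: get_website_content("https://newsletterify.com")
def get_website_content(url):
    # POST the URL to 2markdown; the API key goes in the X-Api-Key header
    res = requests.post("https://2markdown.com/api/2md", json={"url": url}, headers={"X-Api-Key": md_api_key})
    if res.status_code == 200:
        return res.json()["article"]
    # any other status (400, 422, 500) means no article is available
    return None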
Now, we can call the function with OpenAI:
# define the function
functions = [
    {
        "name": "get_website_content",
        "description": "Retrieve the content of any website in URL format",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The URL of the website, e.g. https://openai.com",
                },
            },
            "required": ["url"],
        },
    }
]
# construct the prompt
messages = [{"role": "user", "content": "What's https://newsletterify.com about?"}]  # <- this is our prompt
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=messages,
    functions=functions,
    function_call="auto",  # auto is default, but we'll be explicit
)
response_message = response["choices"][0]["message"]
If OpenAI decided to call the function, the response is not very meaningful yet. It only tells us that it would like to call the function:
>>> print(response_message)
{
  "content": null,
  "function_call": {
    "arguments": "{\n \"url\": \"https://newsletterify.com\"\n}",
    "name": "get_website_content"
  },
  "role": "assistant"
}
In that case, let's fetch the actual response:
if response_message.get("function_call"):
    # call the function
    # Note: the JSON response may not always be valid; be sure to handle errors
    available_functions = {
        "get_website_content": get_website_content,
    }
    # only one function in this example, but you can have multiple
    function_name = response_message["function_call"]["name"]
    function_to_call = available_functions[function_name]
    function_args = json.loads(response_message["function_call"]["arguments"])
    function_response = function_to_call(
        url=function_args.get("url"),
    )
    # send the info on the function call and function response to GPT
    messages.append(response_message)  # extend conversation with assistant's reply
    messages.append(
        {
            "role": "function",
            "name": function_name,
            "content": function_response,
        }
    )  # extend conversation with function response
    second_response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=messages,
    )
    print(second_response["choices"][0]["message"])
The response is now much more meaningful:
{
  "content": "Newsletterify is a platform that aims to simplify the experience of managing and reading newsletters. It provides users with a dedicated email address for newsletter subscriptions and allows them to organize, customize, and schedule digests based on their topics and preferences. By uncluttering the inbox and streamlining the newsletter reading process, Newsletterify enables users to stay up-to-date with industry insights, learn from experts, and become thought leaders in their fields.",
  "role": "assistant"
}
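The function-calling snippet notes that the arguments returned by the model may not be valid JSON, and get_website_content returns None when the request fails. One possible way to guard against both is sketched below. The helper run_function_call is a hypothetical name, not part of the OpenAI library, and the error strings are placeholders you would adapt.

```python
import json


def run_function_call(function_call, available_functions):
    """Execute a model-requested function and always return a string.

    function_call: dict with "name" and "arguments" (a JSON string)
    available_functions: dict mapping function names to callables
    """
    name = function_call.get("name")
    function = available_functions.get(name)
    if function is None:
        return f"Error: unknown function {name!r}"
    try:
        arguments = json.loads(function_call.get("arguments") or "{}")
    except json.JSONDecodeError:
        return "Error: the function arguments were not valid JSON"
    if not isinstance(arguments, dict):
        return "Error: the function arguments were not a JSON object"
    try:
        result = function(**arguments)
    except TypeError as exc:
        # Typically missing or unexpected keyword arguments from the model;
        # note this also catches TypeErrors raised inside the function itself.
        return f"Error: {exc}"
    # A non-string result (such as None on failure) is reported as text
    if not isinstance(result, str):
        return "Error: no content could be retrieved"
    return result
```

The returned string could then be used as the content of the function message, so the model receives a readable explanation rather than a missing value when something goes wrong.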
👉 You can find more information in the official OpenAI documentation about function calling with GPT. You'll also find the original code snippet.