Design of a preview meta tag system for links in chat
When sharing a link in a chat, it is desirable to have a beautiful preview of the link: a small text and an image. How can we most effectively get link previews for users who share links in chat?
(Image: an example of a link preview in a chat.)
The first thought that comes to mind is to let each user device (browser or mobile application) fetch the page behind the link, parse the HTML, extract the image from it, and insert it into the chat. This is a very poor implementation. Why? It is too expensive from the user's point of view: it consumes a lot of traffic, drains the device's battery, and ultimately increases the complexity of the mobile application, which should not be solving such tasks.
Which option should we prefer? The most obvious solution is to do the same work on the backend.
Let's assume that our chat is implemented using web sockets.
The approximate implementation steps are as follows.
#0 We receive a message and check it for links.
We abstract a little from the technical implementation of communication between the server and client, as it would only confuse matters and distract from the essence of designing such a system.
What we need to consider is that we must know for certain whether there is a link in this message or not.
This is necessary for two things: for post-processing the message and for signaling to the client that the message will need to be additionally processed on the client side after post-processing on the server.
So, we need to parse the messages and check if there are any links. The easiest way to do this is by using a regular expression.
$message = "Hello, check awesome web server at https://www.nginx.com/";
// Collect every http:// or https:// URL from the message into $matches[0].
preg_match_all('/\bhttps?:\/\/\S+/', $message, $matches);
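This regular expression is deliberately simple: it matches anything that starts with http:// or https:// and runs up to the next whitespace character. Note that it can also capture punctuation that directly follows a link, such as a trailing comma or period.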
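The extraction step can be wrapped in a small helper. The sketch below is only an illustration: the function name extractLinks and the set of trailing characters it trims are assumptions, not part of the original design.

```php
<?php
// Hypothetical helper: returns the unique URLs found in a chat message.
function extractLinks(string $message): array
{
    // Same pattern as above: "http://" or "https://" followed by non-space characters.
    preg_match_all('/\bhttps?:\/\/\S+/', $message, $matches);

    $links = array();
    foreach ($matches[0] as $url) {
        // \S+ also swallows sentence punctuation such as a trailing "." or ")".
        $url = rtrim($url, '.,;:!?)"\'');
        if ($url !== '' && !in_array($url, $links, true)) {
            $links[] = $url;
        }
    }
    return $links;
}

$links = extractLinks('Docs: https://www.nginx.com/, mirror (http://nginx.org).');
// $links now holds https://www.nginx.com/ and http://nginx.org
```

An empty result means the message needs no post-processing, which is exactly the yes/no answer this step has to produce.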
#1 Link storage
We will need approximately this set of tables:
user message table (messages)
"general" link storage (links_storage)
message-to-link binding table (links_relation)
# messages
| id | message_text
# links_storage
| id | link | title | description | image_preview | processed
# links_relation
| id | message_id | storage_id
If we found links in the message, we need to:
Put the user's message in the messages table.
If the URL is not found in the links_storage table, add the URL to this table with the flag processed = 0.
Create a record in the links_relation table that pairs message_id from the messages table with storage_id from the links_storage table.
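The three steps above can be sketched as follows. To keep the example self-contained, plain PHP arrays stand in for the three tables; the function name storeMessageWithLinks and the array layout are assumptions made for illustration, and a real system would run the equivalent queries against its database.

```php
<?php
// Plain arrays stand in for the three tables; a real system would use a database.
$db = array('messages' => array(), 'links_storage' => array(), 'links_relation' => array());

// Hypothetical helper: saves a message and registers every link found in it.
function storeMessageWithLinks(array &$db, string $text, array $urls): int
{
    // 1. Put the user's message in the messages table.
    $messageId = count($db['messages']) + 1;
    $db['messages'][$messageId] = array('id' => $messageId, 'message_text' => $text);

    foreach ($urls as $url) {
        // 2. Reuse the links_storage row if this URL is already known.
        $storageId = null;
        foreach ($db['links_storage'] as $row) {
            if ($row['link'] === $url) {
                $storageId = $row['id'];
                break;
            }
        }
        if ($storageId === null) {
            $storageId = count($db['links_storage']) + 1;
            $db['links_storage'][$storageId] = array(
                'id' => $storageId, 'link' => $url, 'title' => '',
                'description' => '', 'image_preview' => '', 'processed' => 0,
            );
        }
        // 3. Bind the message to the link.
        $relationId = count($db['links_relation']) + 1;
        $db['links_relation'][$relationId] = array(
            'id' => $relationId, 'message_id' => $messageId, 'storage_id' => $storageId,
        );
    }
    return $messageId;
}

storeMessageWithLinks($db, 'Look: https://www.nginx.com/', array('https://www.nginx.com/'));
storeMessageWithLinks($db, 'Again: https://www.nginx.com/', array('https://www.nginx.com/'));
// links_storage still has one row; links_relation has two.
```

Sharing the same URL twice adds a second relation but no second storage row, which is what later lets a known link skip post-processing.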
Now we have information about whether there is a link in this message and whether we have a preview, text, and description for this link. If processed == 0 (and we assume that this is the case), we need to move on to the next item.
#2 Post-processing: Obtaining the title, description text and image for the link.
So, let's assume that we have received a message from the client, stored it in the database for post-processing, and know that this link needs to be processed.
A practical solution is to run background workers under, for example, supervisord, or to start them from cron. The code for the workers can be written in any language, but our example will be in PHP. These workers will check the table of links that need to be processed and will take the links from there. By post-processing, we mean obtaining the title, image, and description text.
So, let's assume that we have launched a worker that received data from the post-processing table:
select * from links_storage where processed = 0
Next, our worker retrieves the links themselves and obtains metadata from them.
function getLinkPreview($url) {
    // get_meta_tags() reads only <meta name="..."> tags and returns false on failure.
    $tags = @get_meta_tags($url) ?: array();
    // Fall back to an empty string instead of reading a missing array key.
    $title = $tags['og:title'] ?? ($tags['title'] ?? '');
    $description = $tags['og:description'] ?? ($tags['description'] ?? '');
    $imagePreview = $tags['og:image'] ?? '';
    return array(
        'title' => $title,
        'description' => $description,
        'image_preview' => $imagePreview
    );
}
$link = 'https://www.nginx.com/'; #let's assume that we got it from the query above
$preview = getLinkPreview($link);
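One caveat: get_meta_tags() only reads meta tags that carry a name attribute, while many sites publish Open Graph data as meta tags with a property attribute, and the page title lives in the title element rather than in a meta tag. A possible alternative, shown here only as a sketch, is to parse already downloaded HTML with DOMDocument. The function name extractPreviewFromHtml is hypothetical, and downloading the page is left out.

```php
<?php
// Hypothetical helper: extracts preview fields from HTML that was already downloaded.
function extractPreviewFromHtml(string $html): array
{
    if ($html === '') {
        return array('title' => '', 'description' => '', 'image_preview' => '');
    }

    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $meta = array();
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        // Open Graph uses property="og:..."; classic tags use name="...".
        $key = strtolower($tag->getAttribute('property') ?: $tag->getAttribute('name'));
        if ($key !== '' && !isset($meta[$key])) {
            $meta[$key] = trim($tag->getAttribute('content'));
        }
    }

    // Fall back to the <title> element when og:title is absent.
    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $pageTitle = $titleNode ? trim($titleNode->textContent) : '';

    return array(
        'title' => $meta['og:title'] ?? $pageTitle,
        'description' => $meta['og:description'] ?? ($meta['description'] ?? ''),
        'image_preview' => $meta['og:image'] ?? '',
    );
}

$html = '<html><head><title>Fallback title</title>'
    . '<meta property="og:title" content="NGINX">'
    . '<meta name="description" content="Web server">'
    . '</head><body></body></html>';
$preview = extractPreviewFromHtml($html);
// title is "NGINX", description is "Web server", image_preview is empty.
```

The result has the same three keys as getLinkPreview(), so either function could feed the links_storage table.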
EVERYTHING IS READY!
Next, we simply save the obtained $preview data to the link's row in the links_storage table and set the processed = 1 flag.
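A single worker pass might look roughly like the sketch below. It is an illustration under assumptions: the rows of links_storage are passed in as an array rather than read from a database, the function name processPendingLinks is hypothetical, and the fetcher is injected as a callable so the loop can be tried without network access.

```php
<?php
// Hypothetical worker step. $linksStorage mirrors the links_storage table as an array
// of rows; $fetchPreview is any callable that returns title/description/image_preview.
function processPendingLinks(array &$linksStorage, callable $fetchPreview): int
{
    $processed = 0;
    foreach ($linksStorage as &$row) {
        if ((int) $row['processed'] !== 0) {
            continue; // Already has metadata, nothing to do.
        }
        $preview = $fetchPreview($row['link']);
        $row['title'] = $preview['title'] ?? '';
        $row['description'] = $preview['description'] ?? '';
        $row['image_preview'] = $preview['image_preview'] ?? '';
        $row['processed'] = 1;
        $processed++;
    }
    unset($row); // Break the reference left behind by foreach.
    return $processed;
}

$linksStorage = array(
    array('id' => 1, 'link' => 'https://www.nginx.com/', 'title' => '',
          'description' => '', 'image_preview' => '', 'processed' => 0),
);
// A stub fetcher replaces the real HTTP request in this example.
$stubFetcher = function ($url) {
    return array('title' => 'NGINX', 'description' => 'Web server', 'image_preview' => '');
};
$count = processPendingLinks($linksStorage, $stubFetcher);
```

In this sketch a link whose fetch returns nothing is still marked as processed. Retrying failed links would need an extra state beyond the 0/1 flag described here.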
#3 Providing data to the client
So, since we have abstracted away the details of how client-server communication works, we will assume that communication happens via web sockets. We send a message to the client indicating that a message with a certain id has received a set of metadata for links, and they can be displayed. The client simply displays this data in the chat as-is, retrieving it from our tables.
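The notification itself can be a small JSON payload pushed over the socket. The sketch below only builds that payload. The event name link_preview_ready, the field names, and the choice to embed the metadata in the event are all assumptions, since the transport details are intentionally left abstract here.

```php
<?php
// Hypothetical event format; the field names "type", "message_id" and "links" are assumptions.
function buildPreviewEvent(int $messageId, array $linkRows): string
{
    $links = array();
    foreach ($linkRows as $row) {
        // Send only the fields the client needs to render the preview.
        $links[] = array(
            'link' => $row['link'],
            'title' => $row['title'],
            'description' => $row['description'],
            'image_preview' => $row['image_preview'],
        );
    }
    return json_encode(
        array('type' => 'link_preview_ready', 'message_id' => $messageId, 'links' => $links),
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
    );
}

$payload = buildPreviewEvent(42, array(
    array('id' => 1, 'link' => 'https://www.nginx.com/', 'title' => 'NGINX',
          'description' => 'Web server', 'image_preview' => '', 'processed' => 1),
));
```

Embedding the metadata saves the client an extra request. Sending only the message id and letting the client request the data is an equally valid variant.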
What are the advantages of such an implementation?
The client saves traffic.
The client application remains simple and does not engage in activities it should not be engaged in.
The client saves battery power and device processing power.
We do not overload resources unnecessarily as the request comes from the worker on the backend, not from each client device.
If a link has been seen before, it is already in our tables with the processed = 1 flag, and we can return its metadata without any post-processing.