Understanding the behavior of Semji's content extractor
Have you created a Draft from existing content (Updated Content) and noticed that Semji did not retrieve all of the editorial content?
Is some of your online content missing from the editor?
Understanding the default extractor
IMPORTANT: Semji uses a content extraction robot. This extractor is configured to fit the website structures of most of our users.
On most CMSs (the tool you use to manage your website) with classic layouts, our extractor efficiently detects the editorial portion of the page. It imports heading tags (H1, H2, H3...), text paragraphs, internal links within paragraphs, and the formatting (bold, italic...) you used.
Our extractor is set to exclude elements such as menus, footers, and sidebar modules, which are not covered by Semji's recommendations.
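To make the idea concrete, here is a minimal Python sketch of this kind of filtering. It is an illustration only and not Semji's actual extractor: the class name, the function name, and the choice of nav, footer and aside as "non-editorial" tags are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumption for this sketch: editorial content lives in headings and paragraphs.
KEPT_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p"}
# Assumption for this sketch: menus, footers and sidebars use these tags.
SKIPPED_TAGS = {"nav", "footer", "aside"}


class EditorialExtractor(HTMLParser):
    """Hypothetical extractor: collects (tag, text) pairs for editorial blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []          # extracted (tag, text) pairs, in page order
        self._skip_depth = 0      # > 0 while inside a skipped container
        self._current_tag = None  # heading/paragraph currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in KEPT_TAGS and self._skip_depth == 0:
            self._current_tag = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == self._current_tag:
            # Collapse whitespace and keep only non-empty blocks.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.blocks.append((tag, text))
            self._current_tag = None

    def handle_data(self, data):
        if self._current_tag is not None and self._skip_depth == 0:
            self._buffer.append(data)


def extract_editorial(html):
    parser = EditorialExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

With a page made of a menu, a title, a paragraph and a footer, only the title and the paragraph are returned. A template that places editorial text in unusual containers would be missed by such simple rules, which is the kind of situation described in the next paragraph.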
With more complex templates or layouts, the extracted content may be partial. This can happen on pages such as product sheets, or on service pages made up of several blocks where the text only appears after several scrolls, for example.
Some sites also block access to our robots, which makes the content difficult to read. Some UX display choices can likewise limit text retrieval from the HTML code of your page.
Furthermore, it is very important to note that the Semji Bot does not execute the JavaScript code in your pages. This means that if your content is only displayed after the JavaScript has run and is not present in the DOM at loading time, the extractor may extract it only partially or not at all.
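If you want to check whether a given sentence is readable without JavaScript, you can test it against the raw HTML source of the page (what "View page source" shows in a browser). The Python sketch below is a rough, hypothetical check and does not reproduce the Semji Bot: the names StaticTextCollector and present_without_js are invented for this example.

```python
from html.parser import HTMLParser


class StaticTextCollector(HTMLParser):
    """Collects the text present in the raw HTML, ignoring script and style code."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_code = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._in_code += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._in_code > 0:
            self._in_code -= 1

    def handle_data(self, data):
        if self._in_code == 0:
            self.chunks.append(data)


def present_without_js(raw_html, phrase):
    """Return True if `phrase` appears in the page text before any JavaScript runs."""
    collector = StaticTextCollector()
    collector.feed(raw_html)
    collector.close()
    # Normalize whitespace and case on both sides before comparing.
    static_text = " ".join(" ".join(collector.chunks).split()).lower()
    return " ".join(phrase.split()).lower() in static_text
```

If this check returns False for text that you can see in your browser, the text is probably injected by JavaScript after loading, and it will likely have to be added to the Draft manually.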
When does Semji automatically retrieve the text of my content?
This automatic retrieval by the default extractor takes place when you turn existing content into a Draft, i.e. an editable version of your content.
You can easily tell whether you are viewing the online version of the content (not editable) or the Draft version (editable) from the icons located at the top of your menu bar.
Click on the pencil icon to turn the visible online content into editable content and create a Draft in Semji.
Semji will then suggest that you start optimizing your content with the "Start Optimizing" button. By clicking on this button, you create the Draft and start the automatic retrieval of your text. This takes a few seconds, during which Semji displays the "Retrieving Content" message. Meanwhile, your content and its formatting are retrieved and pasted into the editor.
Note: this action does not use any analysis credit. It simply retrieves your content to save you time. The analysis credit is only spent when you choose your Focus Keyword in the right sidebar and click on the "Start analysis" button.
Using copy and paste to complete the text
When the automatic retrieval is incomplete, you can add the missing editorial text afterwards by manually copying and pasting it into the editor. The usual keyboard shortcuts (Ctrl + C and Ctrl + V) work. Semji then automatically retrieves your heading markup and formatting.
This way, Semji takes more of your text into account when calculating your Content Score, which you can then improve by applying the recommendations.
Get a personalized extractor with the Custom offer
If you subscribe to the Custom plan, you can opt for a custom content extractor. It adapts to your main templates and page structures and replaces the default extractor. Our teams configure it manually after discussing the constraints and specificities of your content with you. This more precise extractor can save you time and spare you repeated copy and paste operations.