Implement GEO on your website: llms.txt + Knowledge Graph + JSON-LD (technical guide)

This article is long and technical on purpose. It documents every piece of the Generative Engine Optimization stack I built for shinobis.com with PHP, from the llms.txt file to the automatic Knowledge Graph in JSON-LD. It is not theory. It is code running in production right now.

If you want the philosophy behind GEO, read my other articles. If you want to implement it on your own site, this is the article you need.

The GEO architecture of shinobis.com

The stack has four components that work together. First, llms.txt, which functions as a business card for AI crawlers. Second, a semantic JSON-LD system that generates an automatic Knowledge Graph for each post based on its actual content. Third, a content structure that maximizes citability. Fourth, a network of crossposts distributed across authority platforms with canonical URLs.

Everything runs on vanilla PHP with MariaDB. No frameworks. No plugins. No external dependencies. I wrote every line of code with Claude as my development copilot.

Component 1: llms.txt

The llms.txt file lives at the domain root at shinobis.com/llms.txt. It is plain Markdown with a specific structure proposed by llmstxt.org. The H1 is just the site name. Below that goes a blockquote with the description. Then come sections with H2 headers for Author, Topics Covered, Content Policy, and Key Pages by language.
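A minimal skeleton of that structure could look like the following. Every line of text in it is a placeholder and the example URL is hypothetical; it is not the real content of shinobis.com/llms.txt, and the "Key Pages (English)" naming is only one way to split the key pages by language.

```markdown
# shinobis.com

> Placeholder: a one-paragraph description of what the site is and who it is for.

## Author

- Placeholder: who writes the site and what their background is.

## Topics Covered

- Placeholder topic one
- Placeholder topic two

## Content Policy

- Placeholder: how the content may be quoted or cited.

## Key Pages (English)

- [Placeholder page title](https://shinobis.com/en/example-page): Placeholder description of what the page contains.
```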

Serving it on Apache requires two things. First, a rule in .htaccess, RewriteRule ^llms\.txt$ - [L], placed before the router rules so the file is served directly without going through the PHP router. Second, a Files directive to serve it as text/plain with charset UTF-8.
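As a sketch, the two pieces could sit in .htaccess like this. It assumes mod_rewrite is enabled and that the site's own router rules come later in the same file; ForceType plus AddDefaultCharset is one way to set the content type, not necessarily the exact directives used on shinobis.com.

```apache
RewriteEngine On

# Serve llms.txt as a static file. "-" means "do not rewrite",
# and [L] stops processing so the router rules below never see it.
RewriteRule ^llms\.txt$ - [L]

# ... the site's existing PHP router rules follow here ...

# Send the file as plain text with an explicit UTF-8 charset.
<Files "llms.txt">
    ForceType text/plain
    AddDefaultCharset UTF-8
</Files>
```

A quick check is to request the file with curl -I and confirm the Content-Type header reads text/plain; charset=UTF-8.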

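In robots.txt I add the line LLMs-Txt: https://shinobis.com/llms.txt at the end. This is an informal convention rather than a field defined by the robots.txt standard, so crawlers that do not recognize it simply skip it. The goal is to help crawlers discover that the file exists.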

I wrote everything in English because that is the language AI crawlers and the models behind them handle most reliably, regardless of your content language. I included links to the most important pages in all three languages with descriptions that give the model context about what each page contains.

Component 2: automatic Knowledge Graph with JSON-LD

This is the most complex component and the one with the most impact. Each post automatically generates a JSON-LD block using the schema.org vocabulary that includes three layers of semantic information beyond the basic BlogPosting.

The first layer is relatedLink. When I create a post and connect it with related posts from the admin panel, the system reads the post_connections table from the database and generates an array of semantically related URLs. The code queries the connections, gets the slugs translated to the current language, and builds the complete URLs. That tells AI models that my articles form a knowledge network, not isolated pages.
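A minimal sketch of that query step, under stated assumptions: only the post_connections table name comes from the site. The column names, the post_translations table, the function name buildRelatedLinks, and the {base}/{lang}/{slug} URL pattern are all hypothetical.

```php
<?php
/**
 * Builds the list of related URLs for one post in one language.
 *
 * Assumed (hypothetical) schema:
 *   post_connections(post_id, related_post_id)
 *   post_translations(post_id, lang, slug)
 */
function buildRelatedLinks(PDO $pdo, int $postId, string $lang, string $baseUrl): array
{
    // Join each connection to the translated slug of the related post.
    $sql = 'SELECT t.slug
              FROM post_connections c
              JOIN post_translations t ON t.post_id = c.related_post_id
             WHERE c.post_id = :post_id AND t.lang = :lang
             ORDER BY t.slug';

    $stmt = $pdo->prepare($sql);
    $stmt->execute([':post_id' => $postId, ':lang' => $lang]);

    $links = [];
    foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $slug) {
        // Assumed URL pattern: {base}/{lang}/{slug}
        $links[] = rtrim($baseUrl, '/') . '/' . $lang . '/' . rawurlencode($slug);
    }

    return $links;
}
```

Called as buildRelatedLinks($pdo, $postId, 'en', 'https://shinobis.com'), it returns a plain array of absolute URLs. A related post with no translation in the requested language is simply left out.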

The second layer is about. The system scans the post content looking for predefined keywords like midjourney, prompt, fintech, ux design, llms.txt, and generative. When it detects a keyword it creates an entity typed as Thing, with a name and, where appropriate, a URL. That tells the model what the main topics of the article are without relying solely on the title or H2s.
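The scan can be sketched as a case-insensitive search over the stripped post text. The function name detectAboutEntities, the shape of the keyword map, the entity names, and the URLs are assumptions for illustration. The match is anchored only at the start of a word, so "prompt" also matches "prompts".

```php
<?php
/**
 * Scans post content for predefined keywords and returns schema.org Thing entities.
 *
 * $keywordMap maps a keyword to ['name' => string, 'url' => ?string]
 * (a hypothetical shape chosen for this sketch).
 */
function detectAboutEntities(string $content, array $keywordMap): array
{
    $text = strip_tags($content);
    $entities = [];

    foreach ($keywordMap as $keyword => $meta) {
        // Case-insensitive match that must start at a word boundary,
        // so "prompt" matches "prompts" but not "reprompt".
        $pattern = '/(?<![\p{L}\p{N}])' . preg_quote($keyword, '/') . '/iu';
        if (preg_match($pattern, $text) !== 1) {
            continue;
        }

        $entity = ['@type' => 'Thing', 'name' => $meta['name']];
        if (!empty($meta['url'])) {
            $entity['url'] = $meta['url'];
        }
        $entities[] = $entity;
    }

    return $entities;
}

// Example map; names and URLs are illustrative assumptions.
$keywordMap = [
    'midjourney' => ['name' => 'Midjourney', 'url' => 'https://www.midjourney.com/'],
    'llms.txt'   => ['name' => 'llms.txt', 'url' => 'https://llmstxt.org/'],
    'fintech'    => ['name' => 'Fintech', 'url' => null],
];
```

Entities come back in the order of the map, not in the order they appear in the text, which keeps the output stable between edits of the post.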

The third layer is mentions. It works like about but for specific tools. When the content mentions Claude, ChatGPT, Figma, Midjourney, or Gemini, the system adds them as Thing entities with their name and URL. I used Thing instead of SoftwareApplication because Google treats SoftwareApplication markup as a candidate for its software app rich result and expects fields like offers, aggregateRating, and operatingSystem that do not apply when you are just mentioning the tool.

All of this is generated inside the generateBlogPostingSchema function that already existed for basic SEO. It is not a separate script. It is an extension of the schema I already had.
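To show how the three layers could attach to the base schema, here is a sketch. Only the function name generateBlogPostingSchema comes from the site. The parameters, the $post array keys, and the choice of base fields are assumptions. One caveat: schema.org documents relatedLink as a WebPage property, so a validator may flag it on BlogPosting.

```php
<?php
/**
 * Builds the JSON-LD <script> tag for one post.
 * Hypothetical signature: the real function's parameters are not documented here.
 */
function generateBlogPostingSchema(array $post, array $relatedLinks, array $about, array $mentions): string
{
    // Basic BlogPosting fields (assumed keys in $post).
    $schema = [
        '@context'      => 'https://schema.org',
        '@type'         => 'BlogPosting',
        'headline'      => $post['title'],
        'url'           => $post['url'],
        'datePublished' => $post['published_at'],
        'inLanguage'    => $post['lang'],
        'author'        => ['@type' => 'Person', 'name' => $post['author']],
    ];

    // Add each semantic layer only when it has data.
    $layers = ['relatedLink' => $relatedLinks, 'about' => $about, 'mentions' => $mentions];
    foreach ($layers as $property => $value) {
        if ($value !== []) {
            $schema[$property] = $value;
        }
    }

    // JSON_HEX_TAG escapes "<" and ">" so post text cannot close the script tag early.
    $json = json_encode(
        $schema,
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_HEX_TAG | JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR
    );

    return '<script type="application/ld+json">' . "\n" . $json . "\n" . '</script>';
}
```

Passing the layers in as plain arrays keeps the function free of database and content-scanning logic, so each layer can be built and tested separately.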

Component 3: citable content structure

There is no code for this. It is writing discipline. But documenting it is important because it is what makes the other components work.

Every paragraph in my articles is written to be extractable on its own. If an LLM takes a paragraph from my article and puts it in a response, that paragraph should make sense without the context of the rest of the article. That means writing direct statements with a specific subject, verb, and object. Not starting with "there are many ways to do it" but with "to implement an automatic Knowledge Graph in PHP you need a function that scans the post content, detects predefined keywords, and generates an array of typed entities in JSON-LD format."

I also use Sentiment Mapping. I write with experiential authority: "I implemented this and it worked." "I tested this and it failed." "After months of experimenting I discovered that..." That tone tells the LLM I am speaking from real experience, not speculation. My working assumption, which I cannot verify from the outside, is that models give more weight to experiential evidence than to opinion.

Component 4: distribution with canonical URLs

Each article is published first on shinobis.com as the canonical source. Then it is crossposted to high-authority platforms like HackerNoon, Dev.to, Hashnode, and Medium. Each crosspost includes a canonical URL pointing to the original article on my site.

For LLMs this creates a distributed authority signal. The same author appears across multiple high-reputation sources talking about the same topics. That reinforces the authority profile the model builds internally.

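For Google the canonical URL tells the crawler to consolidate ranking signals on the original article at shinobis.com instead of treating the crosspost as duplicate content. Links inside the crosspost that point back to my site can also count as backlinks, although whether they are followed depends on each platform. HackerNoon, for example, scores 90+ on Domain Authority, which is a third-party metric and not something Google publishes. Each crosspost is a one-hour investment that pays dividends in both worlds.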

How to measure if GEO works

There are no direct metrics. There is no Google Analytics for ChatGPT responses. What I can measure is indirect traffic from people who arrived at my site after seeing an AI response that mentioned me. I can also test directly by asking different LLMs about my topics and checking if they cite me.
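One way to approximate that indirect traffic is to classify the Referer header of incoming requests. The sketch below is an assumption, not something running on shinobis.com: the function name detectAiReferrer and the host list are illustrative, the list is incomplete, and many AI products send no Referer at all, so the count is only a lower bound.

```php
<?php
/**
 * Returns the assistant name when the Referer host is a known AI chat product, or null.
 */
function detectAiReferrer(?string $referer, array $hosts): ?string
{
    if ($referer === null || $referer === '') {
        return null;
    }

    // parse_url() returns null or false when no host can be extracted.
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }

    // Normalize: lowercase and drop a leading "www.".
    $host = strtolower($host);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    return $hosts[$host] ?? null;
}

// Illustrative host list (assumption); extend it as new products appear.
$aiHosts = [
    'chatgpt.com'           => 'ChatGPT',
    'chat.openai.com'       => 'ChatGPT',
    'perplexity.ai'         => 'Perplexity',
    'gemini.google.com'     => 'Gemini',
    'claude.ai'             => 'Claude',
    'copilot.microsoft.com' => 'Copilot',
];
```

In a request handler this would be called as detectAiReferrer($_SERVER['HTTP_REFERER'] ?? null, $aiHosts), with any non-null result logged next to the requested URL.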

Before implementing the GEO stack no LLM cited me when asked about designers using AI in fintech. After implementation I started appearing in responses consistently. I do not have exact numbers because nobody does. But the difference is observable.

What I can measure with Google Search Console is the impact of the Knowledge Graph on traditional SEO. In my data, posts with complete semantic schema have more impressions than those with only basic BlogPosting. Rich snippets appear more frequently. And CTR improves, which I attribute to Google better understanding what each page is about.

What I would implement if starting today

If I had to implement GEO on a new site from scratch, I would do it in this order. First, the citable content structure, because it requires no code and has the most impact. Second, llms.txt, because it takes one afternoon and the cost is zero. Third, the Knowledge Graph with JSON-LD, because it requires development but the return is high. Fourth, the platform distribution, because it requires constant time but each crosspost accumulates value.

You do not need PHP. You can do the same with any language or CMS. What matters are the concepts: citable content, semantic graph, distributed presence. The implementation is secondary.

All the code I use is in production at shinobis.com. It is not experimental. It is not conceptual. It is real and it works.