Wikidata

A free, machine-readable knowledge database operated by the Wikimedia Foundation that assigns a unique Q-ID to every entity and publishes all data under the CC0 license

Tags: wikidata, knowledge graph, Q-ID, structured data, entity SEO, linked data

What is Wikidata?

Wikidata is a structured knowledge database launched in 2012 by the Wikimedia Foundation — the same organization that runs Wikipedia. If Wikipedia is an encyclopedia for humans to read, Wikidata is an encyclopedia for machines to read.

According to Wikidata's official introduction, every entity (person, organization, place, concept) receives a unique Q-ID (for example, Q95 is Google), and all data is published under the CC0 license (public domain), meaning it can be freely used without attribution.
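As a rough illustration of how a Q-ID is used in practice, the sketch below checks that a string has the shape of a Q-ID and builds two addresses from it: the entity's concept URI and its `Special:EntityData` JSON export. The helper names (`is_qid`, `entity_uri`, `entity_json_url`) are invented for this example, and nothing is fetched over the network.

```python
import re

# A Q-ID is the letter Q followed by a positive integer (no leading zero).
_QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def is_qid(value: str) -> bool:
    """Return True if value has the shape of a Wikidata item ID."""
    return _QID_PATTERN.fullmatch(value) is not None


def entity_uri(qid: str) -> str:
    """Concept URI that identifies the item in linked-data contexts."""
    if not is_qid(qid):
        raise ValueError(f"not a Q-ID: {qid!r}")
    return f"http://www.wikidata.org/entity/{qid}"


def entity_json_url(qid: str) -> str:
    """Address of the item's machine-readable JSON export."""
    if not is_qid(qid):
        raise ValueError(f"not a Q-ID: {qid!r}")
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


if __name__ == "__main__":
    # Q95 is the Google example used in the text above.
    print(entity_uri("Q95"))
    print(entity_json_url("Q95"))
```

Validating the ID before building a URL keeps property IDs such as P31 or free-text names from being mistaken for items.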

Why does it matter for LLMs?

Two properties make Wikidata structurally central to the LLM era.

Unique identifiers. The Q-ID cleanly resolves homonym ambiguity. Whether "Apple" means the tech company (Q312) or the fruit (Q89), the Q-ID makes the distinction unambiguous for machines.

Free license. Because CC0 imposes no copyright restrictions, Wikidata can be included in LLM training datasets, directly or indirectly, without licensing friction. A brand with a Wikidata entry is therefore more likely to be covered in pre-training data.

Additionally, the Google Knowledge Graph draws on Wikidata as one of its data sources. Improving a Wikidata entry can therefore propagate through the chain: Wikidata → Google Knowledge Graph → Google AI Overview → Gemini answers.
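To make the unique-identifier point from this section concrete, here is a minimal sketch of disambiguation by Q-ID. The Q-IDs come from the examples above; the `Entity` class, the short descriptions, and the `resolve` helper are assumptions made for illustration, and a real system would look candidates up in Wikidata rather than in a hand-written list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    qid: str
    mention: str       # surface form as it appears in text
    description: str   # hand-written hint, not copied from Wikidata


# Hand-written sample data for illustration only.
ENTITIES = [
    Entity("Q312", "Apple", "technology company"),
    Entity("Q89", "Apple", "fruit"),
    Entity("Q95", "Google", "technology company"),
]


def candidates(mention: str) -> List[Entity]:
    """All entities whose surface form matches, ignoring case."""
    wanted = mention.casefold()
    return [e for e in ENTITIES if e.mention.casefold() == wanted]


def resolve(mention: str, hint: str) -> Optional[str]:
    """Return the Q-ID of the only candidate whose description contains hint."""
    needle = hint.casefold()
    matches = [e for e in candidates(mention) if needle in e.description.casefold()]
    return matches[0].qid if len(matches) == 1 else None


if __name__ == "__main__":
    print([e.qid for e in candidates("Apple")])
    print(resolve("Apple", "fruit"))
```

The same word yields two candidates, and only the Q-ID tells them apart once a context hint has picked one.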

Data structure: Triples

Wikidata represents knowledge in triples (subject–predicate–object):

RanketAI (subject) — instance of (predicate) — software as a service (object)

This relational structure is what allows LLMs to determine "which entities are representative in a given category" — something that flat document text cannot convey as precisely.
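The triple model above can be sketched in a few lines of Python. This is only an in-memory illustration, not how Wikidata stores data: the `Triple` type and `subjects_with` helper are invented names, and `ExampleCRM` is a hypothetical second entity added so that the category query returns more than one result.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


TRIPLES = [
    Triple("RanketAI", "instance of", "software as a service"),
    Triple("RanketAI", "country", "South Korea"),
    # Hypothetical entity, present only to make the query below interesting.
    Triple("ExampleCRM", "instance of", "software as a service"),
]


def subjects_with(predicate: str, obj: str, triples: List[Triple] = TRIPLES) -> List[str]:
    """Return the sorted subjects that have the given predicate and object."""
    return sorted({t.subject for t in triples if t.predicate == predicate and t.obj == obj})


if __name__ == "__main__":
    # Which entities belong to the "software as a service" category?
    print(subjects_with("instance of", "software as a service"))
```

Because every fact has the same three-part shape, "list everything in category X" becomes a simple filter, which is hard to do reliably over free-form text.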

Key Wikidata properties for brands

Property          ID     Example
instance of       P31    software as a service
country           P17    South Korea
official website  P856   https://www.ranketai.com
Twitter username  P2002  ranket_ai
industry          P452   software industry
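One way to check which of these properties an entry already has is a SPARQL query. The sketch below only builds the query text from the property IDs in the table; it does not send it anywhere. It assumes the query will be run on the Wikidata Query Service, where the `wd:` and `wdt:` prefixes are predefined. The function name `build_brand_query` and the variable names are made up for this example, and Q95 (Google) stands in for the brand's own Q-ID.

```python
# Property IDs taken from the table above.
BRAND_PROPERTIES = {
    "instance_of": "P31",
    "country": "P17",
    "official_website": "P856",
    "twitter_username": "P2002",
    "industry": "P452",
}


def build_brand_query(qid: str) -> str:
    """Return SPARQL text that asks for each brand property of one item.

    Each property sits in its own OPTIONAL block so that a missing
    value leaves its variable unbound instead of emptying the result.
    """
    select_vars = []
    patterns = []
    for name, pid in BRAND_PROPERTIES.items():
        select_vars.append(f"?{name}")
        patterns.append(f"  OPTIONAL {{ wd:{qid} wdt:{pid} ?{name} . }}")
    header = "SELECT " + " ".join(select_vars) + " WHERE {"
    return "\n".join([header, *patterns, "}"])


if __name__ == "__main__":
    print(build_brand_query("Q95"))
```

Unbound variables in the result point to properties that are still missing from the entry.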
