Difference between revisions of "Semantic MediaWiki"

From Freephile Wiki
(Add Codex)
(re-order, and group under ea. day heading)
Line 6: Line 6:
 
The [https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2023 3-day program] was fantastic!
 
  
One major advancement was the fact that Bernard Krabina opened ties with [https://docs.opencollective.com/help/about/introduction Open Collective] so that individuals and organizations can [https://www.semantic-mediawiki.org/wiki/Sponsorship#Donating_money donate money] to the project.
+
One major announcement is that through [https://docs.opencollective.com/help/about/introduction Open Collective] individuals and organizations can [https://www.semantic-mediawiki.org/wiki/Sponsorship#Donating_money donate money] to the Semantic MediaWiki project.
  
===Task tracking===
+
=== Day One ===
 +
 
 +
=== Day Two ===
 +
 
 +
====Major changes on interfaces of MediaWiki RDBMS library====
 +
https://www.mediawiki.org/wiki/Manual:Database_access
 +
 
 +
=== Day Three ===
 +
 
 +
====Open Semantic Lab====
 +
[https://github.com/OpenSemanticLab Open Semantic Lab] starts with the premise that '''Ontologies'''<ref>See Simon's https://github.com/General-Process-Ontology/ontology</ref> are key to standardize '''''everything'''''... but '''tools''' are needed to make ontologies '''''applicable''''' in everyday research. The OSL is the holistic and community-driven platform to fulfill this role... and links '''people''' (knowledge), '''machines''' (data) and '''algorithms''' (AI) '''equally.'''
 +
 
 +
Since last year, the project has been completely based on the industry standards of JSON-SCHEMA and JSON-LD, enabling new applications quickly with easy integration to any third party software. Experimental support has been added to achieve Python triggered workflows through REST APIs and LocalGPT Q&A + search assistance.
 +
 
 +
The whole subject is quite advanced, so it can be hard to wrap your head around it. A good way to understand the power of the Open Semantic Lab system is to look at an example use case where it was put into practice<ref>This was in 2022, so before the new version. The project was shown in the presentation at https://www.youtube.com/watch?v=MZlk5Gzy0tc&t=1564s</ref>. At https://kiprobatt.de/wiki, they illustrate ''Intelligent battery cell manufacturing.'' The [https://kiprobatt.de/wiki/Parameter_correlations/High_priority diagramming capabilities of the system are impressive] - showing interactive (clickable) node graphs plus Draw.io integration. The underlying software application is not for the faint-of-heart. Checking the [https://kiprobatt.de/wiki/Special:Version Special:Version] page shows an extensive list of complex MediaWiki and Semantic MediaWiki extensions. The latest version's [https://opensemantic.world/w/index.php?title=Item%3AOSWdb485a954a88465287b341d2897a84d6&reveal=true&useskin=timeless#/Technology_Stack technology stack is represented here].
 +
 
 +
'''See also:''' [https://opensemantic.world/wiki/Main_Page https://opensemantic.world/] - a reference deployment of the '''OpenSemanticWorld''' packages. Here is a [https://opensemantic.world/w/index.php?title=Item%3AOSWdb485a954a88465287b341d2897a84d6&reveal=true&useskin=timeless#/What_is_different_to_Vanilla_.28Semantic.29_MediaWiki.3F brief explanation] of the key differences from 'vanilla' Semantic MediaWiki<ref>The dynamic "slide format" of page content is also impressive.</ref>. You can see at [https://opensemantic.world/w/index.php/Special:Version Special:Version], the extensions and software components.
 +
 
 +
====Task tracking====
 
HalloWelt! combines four extensions they created to make useful task tracking in (Semantic) MediaWiki
 
  
Line 18: Line 36:
 
Miriam Schlindwein presented how it's possible to create tasks, assign them to someone, add due dates and how they can be controlled  {{#ev:youtube|lYpi08dqBPs|||||t=13336}}
 
  
===Realtime integrations with GitLab===
+
====Realtime integrations with GitLab====
 
See [[GitLab operations]]
 
  
===Fixing Wikidata===
+
====Natural Language Queries to Wikidata: A Naïve Prototype====
Yaron Koren gave a great presentation ([https://commons.wikimedia.org/wiki/File:Fixing_Wikidata_-_SMWCon_2023.pdf slides]) called '''[http://wikiworks.com/enhanced-wikibase.html Enhanced Wikibase]''' on how [[Wikibase]] (and therefore Wikidata) are missing features. He showed how he implemented these missing features in a series of developments. One is showcased at [https://wikidatawalkabout.org/ Wikidata Walkabout] - a drill-down and query interface to Wikibase sites; powered by [https://github.com/sahajsk21/Anvesha Anvesha] - a JavaScript library. [https://www.youtube.com/live/lYpi08dqBPs?si=ci0swXD2e-e7qCwy&t=19983 Video presentation]
 
 
 
===Natural Language Queries to Wikidata: A Naïve Prototype===
 
 
[[File:Architecture - Ask Wikidata SMWCon 2023.png|alt=Application architecture|thumb|architecture]] Robert Timms - Sr. Software Engineer Wikibase Suite, Wikimedia Deutschland gave [https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2023/Natural_Language_Queries_to_Wikidata:_A_Na%C3%AFve_Prototype a talk] ([https://github.com/rti/askwikidata code] [https://docs.google.com/presentation/d/1YgDmcvoXaqnYdRyX5RxewVkeioEJ92nb8Sfb_halBsM slides] [https://colab.research.google.com/drive/1yRZshpNj0kXwY0XuUYw5ziqjw_RffxH- try it]) about querying Wikibase with an LLM. Slides 9-22 go from the application architecture to the 'tada' moment.
 
  
Line 38: Line 53:
 
{{Notice|The 'gpt' in ChatGPT stands for "Generative Pre-trained Transformer" - or a fancy way to say "guess". The '''artificial''' intelligence of large language model GPTs '''guess''' what you would say next based on the prompt given and the dataset they are trained on. In OpenAI's own words: "Generative AI models formulate responses by matching patterns or words, while RAG systems retrieve data based on similarity of meaning or semantic searches."}}
 
  
===Major changes on interfaces of MediaWiki RDBMS library===
+
====Fixing Wikidata====
https://www.mediawiki.org/wiki/Manual:Database_access
+
Yaron Koren gave a great presentation ([https://commons.wikimedia.org/wiki/File:Fixing_Wikidata_-_SMWCon_2023.pdf slides]) called '''[http://wikiworks.com/enhanced-wikibase.html Enhanced Wikibase]''' on how [[Wikibase]] (and therefore Wikidata) are missing features. He showed how he implemented these missing features in a series of developments. One is showcased at [https://wikidatawalkabout.org/ Wikidata Walkabout] - a drill-down and query interface to Wikibase sites; powered by [https://github.com/sahajsk21/Anvesha Anvesha] - a JavaScript library. [https://www.youtube.com/live/lYpi08dqBPs?si=ci0swXD2e-e7qCwy&t=19983 Video presentation]
 
 
===Open Semantic Lab===
 
[https://github.com/OpenSemanticLab Open Semantic Lab] starts with the premise that '''Ontologies'''<ref>See Simon's https://github.com/General-Process-Ontology/ontology</ref> are key to standardize '''''everything'''''... but '''tools''' are needed to make ontologies '''''applicable''''' in everyday research. The OSL is the holistic and community-driven platform to fulfill this role... and links '''people''' (knowledge), '''machines''' (data) and '''algorithms''' (AI) '''equally.'''
 
 
 
Since last year, the project has been completely based on the industry standards of JSON-SCHEMA and JSON-LD, enabling new applications quickly with easy integration to any third party software. Experimental support has been added to achieve Python triggered workflows through REST APIs and LocalGPT Q&A + search assistance.
 
 
 
The whole subject is quite advanced, so it can be hard to wrap your head around it. A good way to understand the power of the Open Semantic Lab system is to look at an example use case where it was put into practice<ref>This was in 2022, so before the new version. The project was shown in the presentation at https://www.youtube.com/watch?v=MZlk5Gzy0tc&t=1564s</ref>. At https://kiprobatt.de/wiki, they illustrate ''Intelligent battery cell manufacturing.'' The [https://kiprobatt.de/wiki/Parameter_correlations/High_priority diagramming capabilities of the system are impressive] - showing interactive (clickable) node graphs plus Draw.io integration. The underlying software application is not for the faint-of-heart. Checking the [https://kiprobatt.de/wiki/Special:Version Special:Version] page shows an extensive list of complex MediaWiki and Semantic MediaWiki extensions. The latest version's [https://opensemantic.world/w/index.php?title=Item%3AOSWdb485a954a88465287b341d2897a84d6&reveal=true&useskin=timeless#/Technology_Stack technology stack is represented here].
 
 
 
'''See also:''' [https://opensemantic.world/wiki/Main_Page https://opensemantic.world/] - a reference deployment of the '''OpenSemanticWorld''' packages. Here is a [https://opensemantic.world/w/index.php?title=Item%3AOSWdb485a954a88465287b341d2897a84d6&reveal=true&useskin=timeless#/What_is_different_to_Vanilla_.28Semantic.29_MediaWiki.3F brief explanation] of the key differences from 'vanilla' Semantic MediaWiki<ref>The dynamic "slide format" of page content is also impressive.</ref>. You can see at [https://opensemantic.world/w/index.php/Special:Version Special:Version], the extensions and software components.
 
  
=== Codex, the Design System for Wikimedia ===
+
====Codex, the Design System for Wikimedia====
  
* [https://www.youtube.com/live/lYpi08dqBPs?si=nfOmDAu0AvuVQEcq&t=23206 Video link on YouTube]  
+
*[https://www.youtube.com/live/lYpi08dqBPs?si=nfOmDAu0AvuVQEcq&t=23206 Video link on YouTube]
* [https://docs.google.com/presentation/d/14KunarL34ImfnF5AB8v7BbFK71LOXHALBQZfBxmyOBk/edit#slide=id.g23e4b7f11b0_0_1162 Slides]
+
*[https://docs.google.com/presentation/d/14KunarL34ImfnF5AB8v7BbFK71LOXHALBQZfBxmyOBk/edit#slide=id.g23e4b7f11b0_0_1162 Slides]
  
 
The Wikimedia '''[https://doc.wikimedia.org/codex/latest/ Codex]''' design system is analogous to Google's '''[https://m3.material.io/ Material Design]''', Shopify's '''[https://polaris.shopify.com/ Polaris]''', or IBM's '''[https://carbondesignsystem.com/ Carbon]''' {{References}}
 

Revision as of 12:30, 21 December 2023

Semantic MediaWiki is one of the largest and most complex extensions to MediaWiki - and also an indispensable one for enterprise use. The features it provides are partly described on the Metadata page.

This page exists to dive deeper into particulars.

SMWCon 2023

The 3-day program was fantastic!

One major announcement is that through Open Collective individuals and organizations can donate money to the Semantic MediaWiki project.

Day One

Day Two

Major changes on interfaces of MediaWiki RDBMS library

https://www.mediawiki.org/wiki/Manual:Database_access

Day Three

Open Semantic Lab

Open Semantic Lab starts with the premise that Ontologies[1] are key to standardizing everything... but tools are needed to make ontologies applicable in everyday research. The OSL is the holistic and community-driven platform to fulfill this role... and links people (knowledge), machines (data) and algorithms (AI) equally.

Since last year, the project has been completely based on the industry standards of JSON-SCHEMA and JSON-LD, enabling new applications quickly with easy integration into any third-party software. Experimental support has been added for Python-triggered workflows through REST APIs and for LocalGPT Q&A + search assistance.
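To make the JSON-SCHEMA plus JSON-LD idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the battery-cell record, the `example.org` vocabulary IRI, and the helper names `validate` and `expand_keys` are hypothetical and are not taken from Open Semantic Lab. The validator only handles a tiny subset of JSON Schema, and the key expansion is a simplified stand-in for real JSON-LD processing.

```python
import json

# Hypothetical JSON Schema for a battery-cell record (illustrative only).
CELL_SCHEMA = {
    "type": "object",
    "required": ["@context", "name", "capacity_mAh"],
    "properties": {
        "name": {"type": "string"},
        "capacity_mAh": {"type": "number"},
    },
}

# Hypothetical JSON-LD document: "@context" maps short keys to IRIs,
# which is what lets other software interpret the data unambiguously.
CELL_DOC = {
    "@context": {
        "name": "https://schema.org/name",
        "capacity_mAh": "https://example.org/vocab#capacity_mAh",
    },
    "name": "Cell 42",
    "capacity_mAh": 3000,
}

# Mapping from JSON Schema type names to Python types (subset).
_TYPES = {"string": str, "number": (int, float), "object": dict}


def validate(doc, schema):
    """Check only 'required' and per-property 'type'; not full JSON Schema."""
    errors = []
    for key in schema.get("required", []):
        if key not in doc:
            errors.append(f"missing required key: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in doc and not isinstance(doc[key], _TYPES[rule["type"]]):
            errors.append(f"{key} should be of type {rule['type']}")
    return errors


def expand_keys(doc):
    """Replace each short key with the IRI its @context assigns (flat docs only)."""
    context = doc["@context"]
    return {context.get(k, k): v for k, v in doc.items() if k != "@context"}


if __name__ == "__main__":
    print(validate(CELL_DOC, CELL_SCHEMA))
    print(json.dumps(expand_keys(CELL_DOC), indent=2))
```

The schema answers "is this record well-formed?", while the `@context` answers "what do these keys mean?". A real deployment would more likely rely on dedicated JSON Schema and JSON-LD libraries than on hand-rolled helpers like these.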

The whole subject is quite advanced, so it can be hard to wrap your head around. A good way to understand the power of the Open Semantic Lab system is to look at an example use case where it was put into practice[2]. At https://kiprobatt.de/wiki, they illustrate Intelligent battery cell manufacturing. The diagramming capabilities of the system are impressive - showing interactive (clickable) node graphs plus Draw.io integration. The underlying software application is not for the faint of heart. The Special:Version page shows an extensive list of complex MediaWiki and Semantic MediaWiki extensions. The latest version's technology stack is represented here.

See also: https://opensemantic.world/ - a reference deployment of the OpenSemanticWorld packages. Here is a brief explanation of the key differences from 'vanilla' Semantic MediaWiki[3]. The extensions and software components are listed at Special:Version.

Task tracking

HalloWelt! combines four extensions they created to provide useful task tracking in (Semantic) MediaWiki.

Miriam Schlindwein presented how to create tasks, assign them to someone, add due dates, and control them.

Realtime integrations with GitLab

See GitLab operations

Natural Language Queries to Wikidata: A Naïve Prototype

[Figure: Application architecture]

Robert Timms (Sr. Software Engineer, Wikibase Suite, Wikimedia Deutschland) gave a talk (code, slides, try it) about querying Wikibase with an LLM. Slides 9-22 go from the application architecture to the 'tada' moment.


Although it was not the goal of the talk, he revealed some of the key drawbacks of using "AI" in the first place:

  1. Outdated information
  2. Prone to hallucinations
  3. No sources (AI doesn't tell you how or why it claims to be authoritative.)

These drawbacks are supposed to be addressed in part by using the RAG (retrieval-augmented generation) technique.
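The retrieve-then-generate idea behind RAG can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the talk: the three-sentence knowledge base, the `doc-...` identifiers, and the function names are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding-based semantic search a real system would typically use. The final call to an LLM is left out; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base standing in for facts pulled from a wiki.
DOCUMENTS = {
    "doc-berlin": "Berlin is the capital of Germany.",
    "doc-paris": "Paris is the capital of France.",
    "doc-smw": "Semantic MediaWiki is an extension to MediaWiki.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, k=1):
    """Return the ids of the k documents most similar to the question."""
    q = Counter(tokenize(question))
    scored = [
        (cosine(q, Counter(tokenize(text))), doc_id)
        for doc_id, text in DOCUMENTS.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(question):
    """Put the retrieved, labelled sources in front of the question."""
    sources = retrieve(question)
    context = "\n".join(f"[{s}] {DOCUMENTS[s]}" for s in sources)
    return (
        "Answer using only these sources and cite them:\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("What is the capital of Germany?"))
```

Because the retrieved text is current and labelled, the model is given up-to-date information, has less room to hallucinate, and can name its sources, which maps onto the three drawbacks listed above.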

Fixing Wikidata

Yaron Koren gave a great presentation (slides, video presentation) called Enhanced Wikibase on how Wikibase (and therefore Wikidata) is missing features. He showed how he implemented these missing features in a series of developments. One is showcased at Wikidata Walkabout - a drill-down and query interface to Wikibase sites, powered by Anvesha - a JavaScript library.

Codex, the Design System for Wikimedia

The Wikimedia Codex design system is analogous to Google's Material Design, Shopify's Polaris, or IBM's Carbon.

References

  1. See Simon's https://github.com/General-Process-Ontology/ontology
  2. This was in 2022, so before the new version. The project was shown in the presentation at https://www.youtube.com/watch?v=MZlk5Gzy0tc&t=1564s
  3. The dynamic "slide format" of page content is also impressive.